Functions: inlist()

I was once asked what is wrong with code similar to the following:

gen asean4 = 1 if countryname == "Indonesia" | "Malaysia" | "Philippines" | "Thailand"

This is a common mistake. It is understandable to assume that repeating the left side of the expression, in this case 'countryname', is redundant. Alas, Stata requires it, and the correct syntax is:

gen asean4 = 1 if countryname == "Indonesia" | countryname == "Malaysia" ///
| countryname == "Philippines" | countryname == "Thailand"

But we can do better by using the built-in function inlist(). Learning a little more about Stata's built-in functions can be very convenient (and sometimes necessary): shorter code, faster processing, more Facebook time. Using inlist(), the equivalent code is:

gen asean4 = 1 if inlist(countryname, "Indonesia", "Malaysia", "Philippines", "Thailand")

inlist() may also be used for numeric values. For example:

gen asean4 = 1 if inlist(countrycode, 360, 458, 608, 764)

The difference between numeric and string values lies in the number of elements allowed in the list (the number of countries in our example): 254 for numeric values, but only 9 for string values. These limits can differ between Stata releases, so check -help inlist- for your version.
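One way around the short string limit is to split a long list across several inlist() calls joined with |. The following is only a sketch: the data set is made up for illustration, and the /// line continuation assumes the code is run from a do-file rather than typed interactively.

```stata
* Hypothetical example data; the variable name and values are illustrative only
clear
input str20 countryname
"Indonesia"
"Malaysia"
"Vietnam"
"Japan"
end

* Ten country names would not fit in one string inlist() under a limit of 9,
* so the list is split across two calls joined by the logical "or" operator
gen byte asean = inlist(countryname, "Brunei", "Cambodia", "Indonesia", "Laos", "Malaysia") ///
    | inlist(countryname, "Myanmar", "Philippines", "Singapore", "Thailand", "Vietnam")

list countryname asean
```

Because each inlist() call returns 1 or 0, the combined expression is 1 whenever the name appears in either list.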

15 Responses

  1. When I use inlist(), it gives 1 in the expected places but does not give any zeros. Instead, the non-1s are missing values. How do I make it so that anything that is not 1 is zero rather than missing?

    • hi marcos,
      i suppose your code looks like gen newvar = 1 if inlist(var, x, y, z)?
      if this is the case, newvar is 1 for x, y, z and missing for the rest.
      you may write gen newvar = inlist(var, x, y, z) so that newvar is 1 for x, y, z and 0 for the rest.
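The difference between the two forms can be seen in a small sketch. The data set is made up for illustration (the code 392 and the missing row are not from the original post), and the last line is an optional variant that assumes you want observations with a missing code to stay missing rather than become 0.

```stata
* Hypothetical example data; 392 and the missing value are for illustration only
clear
input int countrycode
360
458
392
.
end

* 1 for listed codes, missing everywhere else
gen byte flag_if = 1 if inlist(countrycode, 360, 458, 608, 764)

* 1 for listed codes, 0 everywhere else (including the missing code)
gen byte flag_01 = inlist(countrycode, 360, 458, 608, 764)

* Optional: 1/0 for known codes, but missing when countrycode itself is missing
gen byte flag_m = inlist(countrycode, 360, 458, 608, 764) if !missing(countrycode)

list
```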


  2. Hi All,

    I've found this thread very useful. Is there any way to use the & operator with inlist()? For example, I have the command:

    edit if (applicant_name == "TASLEEM BIBI" & applicant_hh_name == "ALLAH YAR") ///
    | (applicant_name == "NAEEM BIBI" & applicant_hh_name == "MUHAMMAD NASEER") ///
    | (applicant_name == "MUNZA PARVEEN" & applicant_hh_name == "KHALID IQBAL") ///
    | (applicant_name == "ZOHRA BIBI" & applicant_hh_name == "RASOOL BUX") ///
    | (applicant_name == "MERAJ MAI" & applicant_hh_name == "NAZIR AHMAD" & Vlg_code == 33033)

    This goes on for quite a while (there are 25 applicant names and codes). If not using inlist, is there a way I can simplify the command, by perhaps specifying all the names and village codes first and then having all of that edited (I want to simply view them in the data browser)?

    Thanks a bunch,

    • hi, haseeb. & and | are not allowed within inlist(), but you may use them to combine inlist() calls, e.g., edit if inlist(var1, val1, val2) | inlist(var2, val3, val4), keeping in mind how logical operators work. not sure what you wanted to do, but if there is an indicator common to all the observations you want to show, say a village code, you may filter on that variable rather than listing individual names. hope this helps. -mitch
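The caveat about logical operators matters when names must match as pairs. Joining two inlist() calls with & accepts any listed name together with any listed household name, not only the intended pairs. One possible workaround, sketched here with made-up rows and a hypothetical helper variable called pair, is to concatenate the two strings and test the combined value. The string limit of inlist() still applies, so a list of 25 pairs would need several calls joined with |.

```stata
* Hypothetical example data; the third row is a deliberate mismatch
clear
input str20 applicant_name str20 applicant_hh_name
"TASLEEM BIBI" "ALLAH YAR"
"NAEEM BIBI"   "MUHAMMAD NASEER"
"TASLEEM BIBI" "MUHAMMAD NASEER"
end

* Loose filter: flags all three rows, including the mismatched pair
gen byte loose = inlist(applicant_name, "TASLEEM BIBI", "NAEEM BIBI") ///
    & inlist(applicant_hh_name, "ALLAH YAR", "MUHAMMAD NASEER")

* Exact filter: build one key per row and test the pairs themselves
gen str45 pair = applicant_name + "|" + applicant_hh_name
gen byte exact = inlist(pair, "TASLEEM BIBI|ALLAH YAR", "NAEEM BIBI|MUHAMMAD NASEER")

list applicant_name applicant_hh_name loose exact
```

The separator "|" is only a character inside the key string here. It is assumed not to occur in any of the names.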

  3. Hi Mitch

    What would be the code to find values between 2 numbers, say:

    use if inlist(DX1,800, 801,802,803…etc)

    can I do something like

    use if inlist(DX1,800-810)?

    Thank you!

    • Hi, Rafael.

      I am not sure if i understand your question correctly. But based on your example.. it looks like you want to load a subset of your data.

      You may use the -if- or -in- qualifiers for this. See -help use-.

      syntax: use [varlist] [if] [in] using filename [, clear nolabel]

      The function inlist() does not accept a range of values as an argument. Its arguments must be separated by commas and must be either all numbers or all strings, not a mix.
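For a contiguous range, the related function inrange() may be a better fit than listing every value. Note also that inlist(DX1, 800-810) is legal syntax, but 800-810 is evaluated as arithmetic, so it tests whether DX1 equals -10. A minimal sketch follows, using made-up values for DX1 and a hypothetical file name mydata in the commented-out line.

```stata
* Hypothetical example data for DX1
clear
input int DX1
799
800
805
810
811
end

* inrange(z, a, b) is 1 when a <= z <= b, so both endpoints are included
gen byte dx_800_810 = inrange(DX1, 800, 810)

list DX1 dx_800_810

* To load only the matching observations from a file (mydata is hypothetical):
* use if inrange(DX1, 800, 810) using mydata, clear
```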


  4. Amir, next time Stata says "expression too long", you could try introducing the /* */ comment delimiters, which tell Stata that the command continues on the next line

    replace GH="1" if inlist("Ghana", country1, country2,/*
    */country3, country4, country5,/*
    */country6, country7, country8,/*

    Hope it helps.

    • thanks, asabere. but i think amir's problem is that he exceeds the number of arguments allowed with inlist(): 10 arguments for strings, including the variable name. one way to go about this is to use a numeric variable as stephen suggested. =D

  5. Amir,
    The number of string entries allowed with inlist is quite small.
    If you’re going to use inlist with all the EU countries, you’d do better to check a numerical variable, if you can use country codes instead of country names. You may find the findit-able command kountry helpful if you have variables with a conventional string version of country names and want to convert them to some conventional numeric code.

    • Thanks Stephen. I think you are right; it is always convenient to replace string variables with numerical codes.

  6. Thanks Mitch! It is convenient but doesn't work when the number of countries in your string example increases. I tried it for EU countries and the message "expression too long" came up :(

  7. -inlist- (and the similar function -inrange-) were commands I learned about at the UK Stata Users Group earlier this year and wished I’d known about sooner!

    Another bonus tip: -inlist- can be used the other way around to check whether a single value appears in one of several variables, e.g.:

    gen thai = 1 if inlist("Thailand", country1, country2, country3)

    This way round of using -inlist- is a bit rarer than the first, but still useful from time to time!

  8. Thanks Mitch! :) Agree, so convenient. Also, saw it as Cox’s Stata tip #39.

    Keep it coming :)
