Relevant Stata commands:
Pick any number between 1 and 123456789. Type set seed yournumber, where yournumber is the number you picked. This tells Stata where to start when generating random numbers. Stata begins every session from the same default seed, so by setting a different seed each time you start Stata, you can be sure that you won't sample the same units over and over again. Write your seed value in your answer to the HW problem so that we can reproduce the sample later if we need to.
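To see what the seed does, here is a small sketch that draws a 5% sample twice from a made-up data set of 100 units. The seed 20457 and the variable id are arbitrary choices for this illustration, not values from the homework.

```stata
* Toy data: 100 units numbered 1 to 100 (hypothetical, for illustration only)
clear
set obs 100
generate id = _n

* First draw: set the seed, then sample 5 percent of the units
set seed 20457
preserve
sample 5
list id
restore

* Second draw from the restored data with the same seed
set seed 20457
sample 5
list id
```

Both list commands should show the same five units, because the seed and the order of the data were identical before each draw. Changing the seed before the second draw would generally give a different set of units.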
To draw a stratified sample with simple random sampling in each stratum, we first have to partition the population into the strata. We then sample randomly within each stratum. For Lohr's Problem 13, I created four populations (one for each stratum) from the agpop data set: agpopW (West stratum), agpopS (South stratum), agpopNC (Northcentral stratum), and agpopNE (Northeast stratum). To take a stratified sample, first decide what fraction, n_h / N_h, will be sampled in each stratum h. Then, follow the steps below:
Load in the data for the West region (type use agpopW). Use sample perc_w to pick a random sample of (perc_w) percent of the N_west units in agpopW. For example, to sample 5% of the units in the West, type sample 5. Save this as a new data set on your home drive (type save temporary).
Now, type clear. Load in the data for the NE region (type use agpopNE). Use sample perc_ne to pick a sample of (perc_ne) percent of the N_ne units in agpopNE. Now, type append using temporary. This adds on the sampled units in the West to the end of the data set containing the sampled units in the NE. Save this data set as temporary again by typing save temporary, replace. The replace option tells Stata to overwrite the old copy of the file temporary.
Repeat the above process for the NC (agpopNC) and S (agpopS) regions. The last data set that you save will be a stratified, simple random sample from the population.
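The steps above can be collected into a do-file along the following lines. This is only a sketch: it assumes the four agpop files are in the current working directory, and the seed (20457) and the 5 percent rate used in every stratum are placeholders for the values you choose.

```stata
* Placeholder seed: use your own and report it in your answer
set seed 20457

* West: sample, then start the temporary file
use agpopW, clear
sample 5
save temporary, replace

* Northeast: sample, add the West sample, overwrite the temporary file
use agpopNE, clear
sample 5
append using temporary
save temporary, replace

* Northcentral: same pattern
use agpopNC, clear
sample 5
append using temporary
save temporary, replace

* South: after this save, temporary holds the full stratified sample
use agpopS, clear
sample 5
append using temporary
save temporary, replace
```

If you sample a different percentage in each stratum, change the number after each sample command accordingly.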
To analyze data from a stratified, simple random sample, you need to create a vector of weights. The weight for every unit in stratum h is N_h/n_h. Type generate wts = 0 just to create a variable wts in the data set. Now, for all the units in region W, you want to change the weight to N_west/n_west. To do this, look in the data editor to see which observation numbers correspond to the units in the West (for example, say observations 64 through 120 are the units in the West). Then, type replace wts = N_west/n_west in 64/120, substituting the actual numbers for N_west and n_west. This tells Stata to replace the 0 in wts with N_west/n_west for observations 64 to 120. Use similar commands to specify the weights for each of the other regions.
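If you would rather not look up observation numbers, one possible variation is to compute the weight at the moment you sample each stratum, since _N gives the number of observations before and after the sample command. The sketch below shows this for the West only. It is an alternative to the generate/replace approach above, not a required step, and the variable names N_h and n_h, the seed, and the 5 percent rate are all assumptions made for the illustration.

```stata
* Variation on the West step: record N_h before sampling and n_h after
use agpopW, clear
generate N_h = _N          // stratum population size, N_west
set seed 20457             // placeholder seed
sample 5                   // placeholder sampling percentage
generate n_h = _N          // stratum sample size, n_west
generate wts = N_h / n_h   // weight shared by every sampled West unit
```

Doing the same in each region before appending means wts is already filled in for every unit in the final data set, so generate wts = 0 and the replace commands are not needed in that case.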
You also need a vector of finite population correction factors. This is easy. Simply type generate fpcf = 1/wts, which gives the sampling fraction n_h/N_h for the units in each stratum.
Finally, to use svytotal or svymean, simply type svytotal varname [weight = wts], fpc(fpcf) strata(region). The strata option tells Stata what the stratum indicators are. Be sure to type these in, as there will be no automatic selection of weights and fpc as there was in the last homework (when I saved the correct weights and fpc so that you automatically used them when typing svymean or svytotal).
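In newer versions of Stata (version 9 and later), the design is usually declared once with svyset and the estimation commands take the svy: prefix instead. A sketch of the equivalent commands, assuming the stratified sample with wts, fpcf, and region is in memory, and using acres92 as a stand-in for whichever variable the problem asks about (that name is an assumption, not something this handout specifies):

```stata
* Declare the design once: units are the sampled observations,
* with sampling weights, strata, and the finite population correction
svyset _n [pweight = wts], strata(region) fpc(fpcf)

* Estimate the population total and mean of the analysis variable
svy: total acres92
svy: mean acres92
```

After svyset, typing svydescribe is a quick way to check that Stata sees the number of strata and units per stratum that you expect.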