**********************************************
*                                            *
* Tanya Byker                                *
* Economics 211 Fall 2024                    *
* Lab #10                                    *
* Tables and Figures                         *
*                                            *
**********************************************
snapshot erase _all

** Set your working directory and open the data:
cd filepath
use filepath.file.dta, clear

*** Descriptives ***

* A trend graph by [education] category

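* In the labor force = employed or unemployed (empstat 1 or 2 under IPUMS coding).
* Caution: empstat<3 also counts an N/A code of 0 as in the labor force if your sample contains any.
gen inlf=empstat<3 if !missing(empstat)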

gen edcat=.
replace edcat=1 if educ<6
replace edcat=2 if educ==6
replace edcat=3 if educ>6 & educ<10
replace edcat=4 if educ==10
replace edcat=5 if educ==11

label define education 1 "Less Than HS" 2 "High School" 3 "Some College" 4 "Bachelors" 5 "Graduate", replace
label val edcat education

* Including weights

tab year edcat if sex==1 [w=perwt], sum(inlf) mean noobs
tab year edcat if sex==2 [w=perwt], sum(inlf) mean noobs

* procedure:
  *  Copy > Table
  *  Paste into Excel
  *  Insert Line graph
  *  Copy/Paste Excel graph into Document
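
* (sketch) The same trend graph can also be drawn directly in Stata instead of Excel.
* Assumes inlf, edcat, sex, and perwt as defined above; the graph name lfp_trend is made up here.
* preserve/restore keeps the full data safe while the collapsed means are plotted.
preserve
collapse (mean) inlf if sex==2 [aw=perwt], by(year edcat)
twoway (line inlf year if edcat==1) (line inlf year if edcat==2) ///
       (line inlf year if edcat==3) (line inlf year if edcat==4) ///
       (line inlf year if edcat==5), ///
       legend(order(1 "Less Than HS" 2 "High School" 3 "Some College" 4 "Bachelors" 5 "Graduate")) ///
       ytitle("Share in labor force") xtitle("Year") name(lfp_trend, replace)
restore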

  
* A (small) summary stats table  
  
gen LTHS=educ<6
gen HS=educ==6
gen SC=educ>6 & educ<10
gen BA=educ==10
gen GRAD=educ==11
  
sum LTHS HS SC BA GRAD inlf [w=perwt]
sum LTHS HS SC BA GRAD inlf [w=perwt] if year==1970
sum LTHS HS SC BA GRAD inlf [w=perwt] if year==2017

* another strategy for making a summary table -- learning some more advanced code: collapse!

* collapse replaces the data in memory with the summary statistics, so save a snapshot first if you want to get the full data back
snapshot save

collapse (mean) LTHS HS SC BA GRAD inlf [w=perwt], by(year)
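
* (sketch) Save the collapsed table so it can be opened in Excel before the data are restored.
* The file name summary_by_year.xlsx is a placeholder, not part of the original lab.
export excel using summary_by_year.xlsx, firstrow(variables) replace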

snapshot restore 1

* you could also do a two way collapse (by year and sex) to produce summary stats by gender
snapshot save

collapse (mean) LTHS HS SC BA GRAD inlf [w=perwt], by(year sex)

snapshot restore 1

* Descriptive Table considerations
  * ABSOLUTELY NO STATA OUTPUT COPIED INTO YOUR PAPER
  * Don't need 8+ decimal places
  * Don't need to list the sample size over and over


* A scatter plot:

  * examples in:
  *               Lab 7: Philips Curve (labeling points)
  *               Lab 9: Income and Democracy
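
* (sketch) A minimal scatter plot, so that the export below has a graph in memory.
* It plots weighted labor force participation by year and sex; sex==1 is male, as coded later in this lab.
* The titles and legend labels are illustrative choices.
preserve
collapse (mean) inlf [aw=perwt], by(year sex)
twoway (scatter inlf year if sex==1) (scatter inlf year if sex==2), ///
       legend(order(1 "Men" 2 "Women")) ///
       ytitle("Share in labor force") xtitle("Year")
restore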

* Export Stata graphs as .png to easily insert into document

graph export scatter.png, replace 


*** Regression Results ***

* Full-time, full-year indicator: hours come from uhrswork (hrswork2 in 1970), weeks from wkswork2
gen fulltime_year= 0
replace fulltime_year=1 if uhrswork>=40 & wkswork2>=4 
replace fulltime_year=1 if hrswork2>=5 & year==1970 & wkswork2>=4 
replace fulltime_year=0 if hrswork2<5 & year==1970 & wkswork2>=4 


* ln() of zero is missing, so anyone with zero wage income drops out of the regressions below
gen ln_wage=ln(incwage)
gen male=sex==1
gen age2=age^2
tab edcat, gen(ed)


** LOOPS are SUPER useful!

forval i=1/10 {
	di `i'
}

forval i=1970(10)2010 {
	di `i'
}

foreach i of numlist 1970 1980 1990 2000 2010 2017 {
	di `i'
}
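
* (sketch) The same idea applied to this lab's data: loop over variables instead of numbers.
* Assumes the education dummies and perwt created above.
foreach v of varlist LTHS HS SC BA GRAD {
	di "Weighted mean of `v':"
	sum `v' [aw=perwt], meanonly
	di %6.3f r(mean)
}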

*ssc install outreg2

local r replace

foreach y of numlist 1970 1980 1990 2000 2010 2017 {
	reg  ln_wage male if fulltime_year==1 & year==`y' [w=perwt], robust
		outreg2 using test1, excel noaster `r' ctitle(`y')
		estimates store _`y'_n
		
	local r append
	
	reg ln_wage male ed2-ed5 age age2 if fulltime_year==1 & year==`y' [w=perwt], robust
		outreg2 using test1, excel noaster `r'  ctitle(`y') 
		estimates store _`y'
}
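
* (sketch) A quick on-screen check of the stored results from the regressions with controls.
* This is for your own checking only, not for pasting into the paper.
estimates table _1970 _1980 _1990 _2000 _2010 _2017, keep(male) b(%6.3f) se(%6.3f) stats(N r2)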


** Getting Excel Tables into Document

* Click view > Unclick Gridlines
* Copy
* In Word, paste special > Microsoft Excel Binary Worksheet Object


*** Checklist of things in the notes of a Figure or Table:

* What is going on in this figure or table
* Source of data
* Age range? years?
* what is in parentheses (standard errors?)
* what do the stars mean



**  Other (cool) visualizations of regression results


** continuing with the regression above
label var male "gender gap coefficient"
** coefplot is a relatively new user-written command that makes visualizations from regression results
** more info here: http://repec.sowi.unibe.ch/stata/coefplot/getting-started.html

** you may need to install it: 
*ssc install coefplot
coefplot _1970 _1980 _1990 _2000 _2010 _2017, keep(male) vertical 


#delimit ;

coefplot _1970 _1980 _1990 _2000 _2010 _2017, 
keep(male) vertical legend(col(1))
title("Evolution of the US Gender Gap")
graphregion(color(white)) bgcolor(white);

#delimit cr
				
* the notes to this figure should explain what variables are controlled for in the regression 
* and note that there are 95% confidence intervals shown
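
* (sketch) The notes can also be written into the figure itself with the standard twoway note() option.
* The wording of the note is only illustrative; adapt it to your own sample and data source.
coefplot _1970 _1980 _1990 _2000 _2010 _2017, keep(male) vertical legend(col(1)) ///
	title("Evolution of the US Gender Gap") ///
	note("Notes: Coefficient on male from yearly regressions of ln(wage) on male," ///
	     "education dummies, age, and age squared. Lines show 95% confidence intervals.")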




** For this example let's code some variables for race and ethnicity (this is code from Lab 4):

**** Coding categorical variable for race and ethnicity using Census variables.
* NOTE: you need to use both the race and the hispan variables * 
gen white_nh=0
replace white_nh=1 if race==1 & hispan==0

gen black_nh=0
replace black_nh=1 if race==2 & hispan==0
  
gen other_nh=0
replace other_nh=1 if race>2 & hispan==0

gen hispanic=0
replace hispanic=1 if hispan>0

** we should check that the proportions of the groups sum to one

sum white_nh black_nh other_nh hispanic
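
* (sketch) A stricter check: every observation should fall in exactly one group.
* The variable name n_groups is made up for this check and is dropped afterwards.
egen n_groups = rowtotal(white_nh black_nh other_nh hispanic)
assert n_groups==1
drop n_groups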

gen racecat=.
replace racecat=1 if white_nh==1
replace racecat=2 if black_nh==1
replace racecat=3 if other_nh==1
replace racecat=4 if hispanic==1

label define racecat 1 "white_nh" 2 "black_nh" 3 "other_nh" 4 "hispanic", replace
label values racecat racecat

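* Stata treats missing as larger than any number, so guard against missing edcat being coded as college
gen college=edcat>=4 if !missing(edcat)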


** the following commands give the predicted line for each race group
reg ln_wage i.college##i.racecat if year==2017
margins college, over(racecat)
marginsplot, name(predicted_lines, replace) title("Predicted Ln(wages) by BA status and Race") subtitle(US ACS 2017) 

graph export predicted_lines.png, replace 



reg ln_wage i.college##i.racecat if year==2017
** the following command gives us the marginal effect of college for each race group - it adds beta_1 to beta_interaction for each race
margins, dydx(college) over(racecat)
marginsplot, recast(scatter)  horizontal title("Marginal effects (returns to college) by Race") subtitle(US ACS 2017) xtitle(return to a college education) ytitle("") name(marginal_effects, replace)

graph export returns.png, replace
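
* (sketch) Checking one marginal effect by hand, using the regression still in memory:
* for racecat==2 the return to college is the college coefficient plus the
* college x racecat==2 interaction, which should match the margins output above.
lincom 1.college + 1.college#2.racecat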