This vignette describes how to use the R package NlsyLinks together with SAS. It replicates the analysis from the section of the ACE Models with the NLSY vignette called “Example: DF analysis with a univariate outcome from a Gen2 Extract”.
SAS Code:
/*import csv */
DATA LinksFromRPackage;
INFILE "E:/links.csv" DSD LRECL=1024 DLM=',' FIRSTOBS=2;
LENGTH RelationshipPath $14;
INPUT ExtendedID Subject1Tag Subject2Tag R RelationshipPath $;
IF RelationshipPath="Gen2Siblings" THEN OUTPUT;
RUN;
Note that for this to run, missing values must be coded as a period (.), which is how SAS specifies them, and not as NA, the default R missing value code. The file links.csv (read above from E:/links.csv) can be exported from the NlsyLinks R package with the following R code.
### Begin R Code to export links
require(NlsyLinks)
dlink <- subset(Links79Pair, RelationshipPath == "Gen2Siblings")
fp <- file.path(path.package("NlsyLinks"), "extdata", "Gen2Birth.csv")
getwd() # The CSV files below are written to this working directory
dout <- ReadCsvNlsy79Gen2(fp)
write.csv(dout, file="outs.csv", row.names=FALSE, na=".")
write.csv(dlink, file="links.csv", row.names=FALSE, na=".", quote=FALSE)
### End R code to export links
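The effect of the na="." argument can be checked on a small made-up data frame before exporting the real links. The following is only a sketch: the values are invented, and the column names simply mirror the SAS INPUT statement above rather than the actual columns of Links79Pair.

```r
# Toy stand-in for the pair links (invented values, hypothetical column names).
toy <- data.frame(
  ExtendedID       = c(1L, 1L, 2L),
  Subject1Tag      = c(101L, 101L, 201L),
  Subject2Tag      = c(102L, 103L, 202L),
  R                = c(0.5, NA, 0.25),   # one missing relatedness value
  RelationshipPath = c("Gen2Siblings", "Gen2Siblings", "Gen2Siblings")
)

# Write to a temporary file so nothing in the working directory is overwritten.
tmp <- tempfile(fileext = ".csv")
write.csv(toy, file = tmp, row.names = FALSE, na = ".", quote = FALSE)

# Inspect the raw text that SAS would read; the NA should appear as a period.
csv_lines <- readLines(tmp)
print(csv_lines)
```

If the third line of the printed output shows a period in the R column, the file uses the missing value code that SAS expects.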
Once the linking file has been exported from R, the SAS code shown above can be run to read the linking data into SAS. The next few lines of SAS code read in the outcome data. Some outcome data can be obtained from the NlsyLinks package, but the NLS Investigator website will usually be the source of the outcome data.
DATA OutcomesFromRPackageOrYou;
INFILE "E:/outs.csv"
DSD LRECL=1024 DLM=',' FIRSTOBS=2;
INPUT SubjectTag SubjectID ExtendedID Generation SubjectTagOfMother
C0005300 C0005400 C0005700 C0328000 BirthWeightInOunces C0328800;
IF BirthWeightInOunces < 0 THEN BirthWeightInOunces = .;
IF BirthWeightInOunces NE . AND BirthWeightInOunces > 200
THEN BirthWeightInOunces = 200;
RUN;
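For comparison, the same two cleaning rules (negative codes become missing, and values above 200 ounces are capped at 200) can be sketched in R. The vector below is invented for illustration and is not NLSY data.

```r
# Invented birth weights in ounces, including a negative code and a missing value.
bw <- c(-7, 96, 120, 250, NA)

# Rule 1: negative values become missing.
bw[!is.na(bw) & bw < 0] <- NA

# Rule 2: cap remaining values at 200; pmin() leaves missing values as NA.
bw <- pmin(bw, 200)

print(bw)
```

The explicit is.na() guard plays the same role as the NE . check in the SAS step: it keeps missing values from being compared against a number.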
The user could change the INFILE statement to something like the following, which points to where these example NLSY data are stored inside the R package.
INFILE "C:/Program Files/R/R-2.14.2/library/NlsyLinks/extdata/Gen2Birth.csv"
Further data manipulation can then be done in SAS. The resulting data can be saved as a CSV file and read back into R, where the analyses are run.
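When a CSV file written by SAS is read back into R, the period that SAS uses for missing values has to be mapped to NA. A minimal sketch, assuming a small hand-written file in place of a real SAS export (the file contents and the object name ds are hypothetical):

```r
# Stand-in for a CSV exported from SAS; missing values appear as a period.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "SubjectTag,BirthWeightInOunces",
  "201,112",
  "202,.",
  "301,200"
), tmp)

# na.strings = "." converts the SAS missing value code to NA on import,
# so the column is read as numeric rather than as text.
ds <- read.csv(tmp, na.strings = ".")
str(ds)
```

Without na.strings, the period would force the column to be read as character data.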
SAS Output:
The SAS System 16:23 Sunday, January 19, 2014 8
Merge by ID2
The REG Procedure
Model: MODEL1
Dependent Variable: BirthWeightInOunces_1c
Number of Observations Read 22176
Number of Observations Used 17440
Number of Observations with Missing Values 4736
NOTE: No intercept in model. R-Square is redefined.
Analysis of Variance
                                Sum of        Mean
Source                  DF     Squares      Square   F Value   Pr > F
Model                    2     1349821      674911   1583.84   <.0001
Error                17438     7430721   426.12230
Uncorrected Total    17440     8780542
Root MSE          20.64273    R-Square    0.1537
Dependent Mean    -0.09445    Adj R-Sq    0.1536
Coeff Var           -21855
Parameter Estimates
                                       Parameter   Standard
Variable                          DF    Estimate      Error   t Value   Pr > |t|
BirthWeightInOunces_2c             1     0.17766    0.02308      7.70     <.0001
R_times_BirthWeightInOunces_2c     1     0.50416    0.05313      9.49     <.0001
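For readers who want to see the shape of the regression behind this output, the following R sketch fits the same kind of no-intercept model on simulated data. Everything in it is an assumption made for illustration: the variable names y1, y2, and R_times_y2, the simulated coefficients, and the sample size are not taken from the NLSY data, and the estimates will not match the SAS listing. In the simplified DF model, the coefficient on the cosibling's centered score is commonly read as the shared-environment estimate and the coefficient on the R-weighted product as the heritability estimate.

```r
set.seed(1)

# Simulated pairs: R is the relatedness coefficient, y2 the cosibling's score.
n  <- 500
R  <- rep(c(0.25, 0.5), each = n / 2)
y2 <- rnorm(n)
y1 <- 0.2 * y2 + 0.5 * R * y2 + rnorm(n)   # invented generating values

# Center both scores, then build the R-weighted product term.
pairs <- data.frame(R = R, y1 = y1 - mean(y1), y2 = y2 - mean(y2))
pairs$R_times_y2 <- pairs$R * pairs$y2

# No-intercept regression, analogous in form to the PROC REG model above.
fit <- lm(y1 ~ 0 + y2 + R_times_y2, data = pairs)
est <- coef(fit)
print(est)
```

Applied to the SAS listing, the same reading would treat 0.17766 as the shared-environment estimate and 0.50416 as the heritability estimate for birth weight.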
This concludes the vignette on using SAS with the NlsyLinks package.