It's a simple assignment, really – answer a research question (i.e., test a hypothesis) with databases available on the Internet, applying techniques (not necessarily the exact same tools) used in class. You may use Python, Excel, or any other tools you want. This is an open assignment. Your data can be in CSV files or text files. Your question should be at least vaguely related to social science. We'll be pretty lenient about the definition of "social science," but it shouldn't have anything to do with science or engineering (e.g., a hypothesis about stresses on a bridge, or one that requires linear algebra to solve, is probably not going to be acceptable).
You will prepare a paper (3-5 pages) with graphs, and a PowerPoint presentation to fit in 5 minutes, in teams of two.
Specifically, your paper must have the following:
State the hypothesis being tested.
Identify the independent and dependent variables.
Explain how the variables are operationally defined.
Discuss the validity of the operational definitions (both face validity and construct validity).
Discuss the use of and/or need for a control group or control groups.
Discuss whether the findings can be generalized from the sample used.
Indicate whether your approach (for the story you are considering) is naturalistic observation, correlational, and/or experimental. It might not fit neatly into any one of these approaches. Explain your decision.
You must describe how you got the data that you're using, and why it's the right data to use for your question.
You must have descriptive statistics for the data you're using.
You can have graphs of your data (if that's helpful and useful) – you pick which graphs to show and what type (scatter plot? histogram?) in order to best convey your point. You don't have to include graphs if you're doing a multiple regression.
You must have inferential statistics to test your hypotheses.
Tell us how you did the analyses in your project.
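To make the descriptive and inferential statistics items above concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for illustration: the column names (group, score), the ten made-up data rows, and the helper names load_groups, describe, and permutation_test are all hypothetical. The inferential step is a permutation test for a difference in means, which is one reasonable choice and not necessarily the test covered in class; use whatever test fits your hypothesis.

```python
import csv
import io
import random
import statistics

# Hypothetical data for illustration only. In a real project this text
# would come from a CSV file downloaded from your data source.
SAMPLE_CSV = """group,score
A,72
A,85
A,90
A,66
A,78
B,60
B,71
B,65
B,58
B,69
"""


def load_groups(text):
    """Parse CSV text and return a dict mapping group name -> list of scores."""
    groups = {}
    for row in csv.DictReader(io.StringIO(text)):
        groups.setdefault(row["group"], []).append(float(row["score"]))
    return groups


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def mean(values):
    """Plain arithmetic mean (faster than statistics.mean inside a loop)."""
    return sum(values) / len(values)


def permutation_test(a, b, n_resamples=5000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Shuffles the pooled values many times and counts how often a random
    split gives a difference at least as large as the observed one.
    Returns an approximate p-value.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


groups = load_groups(SAMPLE_CSV)
summary = {name: describe(values) for name, values in groups.items()}
p_value = permutation_test(groups["A"], groups["B"])

for name, stats in summary.items():
    print(name, stats)
print("approximate p-value:", p_value)
```

In your paper, the dictionary from describe would feed your descriptive statistics, and the p-value (along with the test you chose and why) would go in your inferential statistics. To read from a real file, replace io.StringIO(text) with an opened file object.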
On July 30, turn in one page stating the question you're answering and who is on your team. One page per team. Only two or three people per team. We'll let you know by the next day if it's acceptable.
At the beginning of class on August 2, you are to email a PDF of your paper containing all of the above. Your paper should be at least three pages long and no longer than five pages. It should be double-spaced, in 12-point Times, with 1-inch margins (top/bottom and left/right).
You will also have to prepare a FIVE minute presentation of your work in PowerPoint. It must be emailed to Mark Guzdial by 9 am on August 4!!! All presentations will be run from a single computer. Use your names in the PowerPoint file name.