Forum Thread

Need help with Data Set - Regression Analysis project

Kapitalist 4,396 253 May 25, 2011 at 03:28 AM
Hello,

I need to do a Regression Analysis project, but first, of course, I need some good ideas AND I need to know where to get some SMALL data sets (100 - 500 observations). I've googled and spent hours looking for small data sets with no luck!! Either the data set has no labels (really sloppy) or it's got thousands of observations with coded labels... too much for this project. PS: I cannot use summary data; I need the actual data set.

Would really appreciate any help.

4 Comments


#2
Email some universities and ask science or psych professors if they have such data sets.
#3
depending on the type of data, a common problem is that data sets are often poorly annotated or missing data points for whatever reason (old, no curator, lost, etc.).

given that, this is a good website for some data sets: http://archive.ics.uci.edu/ml/index.html. if you're looking for a lower number of data points, medical data is a good bet (since it's expensive and time-consuming to collect samples), but there are often missing data points, especially if the data set is old.
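To make the missing-data point concrete, here is a minimal sketch of filtering out incomplete rows before running a regression. The sample data, the column names, and the set of missing-value markers are all made up for illustration; check the documentation of whichever data set you pick for how it actually marks missing values.

```python
import csv
import io

# Hypothetical sample data; the columns and values are invented for illustration.
RAW = """age,chol,max_hr
63,233,150
67,?,108
41,204,
56,236,178
"""

# Assumed markers for a missing value; real data sets may use others.
MISSING_MARKERS = {"", "?", "NA"}


def complete_rows(text):
    """Return only the rows in which every field has a usable value."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        row
        for row in reader
        if all((value or "").strip() not in MISSING_MARKERS for value in row.values())
    ]


rows = complete_rows(RAW)
print(len(rows), "complete rows")
```

Dropping incomplete rows is only one option. With a small data set it can cost a lot of observations, so it is worth counting how many rows survive before committing to it.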

as for project ideas: professors like to advertise how clever their students are, so try googling terms like 'machine learning class project' or 'genomics class project' to find previous projects.

a good program for applying different regression algorithms is Weka: http://www.cs.waikato.ac.nz/ml/weka/
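For anyone who wants to see what the simplest regression actually computes, here is a rough sketch of ordinary least squares with one predictor in plain Python. This is not Weka's API, and the function name and toy numbers are made up for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Real project data will not lie on a perfect line, so the fitted slope and intercept are estimates, and a stats package will also report things like standard errors and R-squared that this sketch leaves out.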

last but not least, why not ask your prof/TA? it's their job :)
Last edited by bubbachuck May 25, 2011 at 08:49 AM
Original Poster
#4

thanks Bubbachuck... I will check out these links. The professor is telling us the project can be on anything we choose! We had some ideas, but when we went to gather the data, the data sets were enormous and required SPSS to open (.sav). I'm an IST student taking quant as an elective, and the other students (business/strategy) are not stats people or research people (who would primarily use SPSS). I went to the lab to convert the data and wow, was it NOT user friendly! ... plus tens of thousands of observations!

2much
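One way around "tens of thousands of observations" is to draw a random subsample of a few hundred rows. Below is a minimal sketch, assuming the data has already been exported from the .sav file into a list of row dictionaries (for example via a CSV export). The `subsample` helper and the fake data are hypothetical. If pandas and the pyreadstat package are available, `pandas.read_spss` is one way to read a .sav file directly, but it is not used here.

```python
import random


def subsample(rows, k, seed=0):
    """Return k rows chosen at random without replacement (all rows if fewer than k)."""
    rng = random.Random(seed)  # fixed seed so the same subsample can be reproduced
    if len(rows) <= k:
        return list(rows)
    return rng.sample(rows, k)


# Stand-in for a large exported data set; the values are invented.
big = [{"id": i, "value": i * 2} for i in range(20000)]
small = subsample(big, 300)
print(len(small))
```

Keeping the seed fixed means everyone in the group works from the same 300 rows. It is worth stating in the write-up that the analysis uses a random subsample of the full data set.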