First of all, I have to admit that I'm a complete beginner in statistics (let's say I know one or two things) and that all of the ideas written here are just speculation on my part.
I'm quite obsessed with finding patterns in things.
I have noticed that people who look alike, that is, who share physical features, often show similar behavior too.
In genetics, the term linkage indicates the tendency of certain alleles to be inherited together, either because they are physically close to one another on a chromosome or because it is important for some reason that they do not assort independently.
Linkage is often used to discover genes related to diseases.
In my opinion, we could extend this idea and use it not only to find disease-related genes but also to discover protective factors against diseases.
My idea is to look for correlations between people's medical history and their:
- sexual life
- family history
- social condition
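As a rough illustration of what "looking for a correlation" could mean for one yes/no factor and one yes/no disease, here is a minimal sketch using only the Python standard library. It computes an odds ratio and a Pearson chi-square test on a 2x2 table. The function names and the counts are hypothetical and made up purely for illustration; they are not real data, and a real analysis would also need to handle confounding and multiple comparisons.

```python
import math

def odds_ratio_2x2(a, b, c, d):
    """Odds ratio for a 2x2 table of counts:

                        disease   no disease
        factor present     a          b
        factor absent      c          d

    A value below 1 suggests the factor is associated with less disease
    (a possible protective factor); above 1 suggests the opposite.
    """
    if 0 in (a, b, c, d):
        # Haldane-Anscombe correction: add 0.5 to every cell
        # so that a zero count does not cause a division by zero.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return (a * d) / (b * c)

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) and p-value."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column of the table is empty")
    statistic = n * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts, invented for this example only.
with_factor_sick, with_factor_healthy = 10, 90
without_factor_sick, without_factor_healthy = 30, 70

table = (with_factor_sick, with_factor_healthy,
         without_factor_sick, without_factor_healthy)
ratio = odds_ratio_2x2(*table)
statistic, p_value = chi_square_2x2(*table)
print(f"odds ratio = {ratio:.3f}, chi-square = {statistic:.2f}, p = {p_value:.5f}")
```

An association found this way is only a starting point: it says the factor and the disease occur together more or less often than chance would suggest, not that one causes the other.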
For some months I have been thinking about how much data we spontaneously put into social networks, and how this data, combined with our connections (friends, parents and relatives), could form patterns that a computer, rather than a human being, can detect and extrapolate from (this is called data mining, as I discovered only today).
I still don't know how to work this out; for now I plan to develop a survey application for Facebook.
But with this approach there are some problems:
- Noise: people may answer questions with false information.
- Appeal: people won't answer questions for no reason. I have to make the application appealing and build a game around it (e.g. Find your partner, Who looks like you, etc.).
- Harshness: Facebook is just for fun, and of course I can't ask: Are you obese? Are you rich? Questions have to be asked gently and indirectly: Do you like eating? What mobile phone do you have?
Besides the survey answers, the application could also gather information on its own:
- From status updates it could learn about a user's habits: what they eat, which sports they practice (and, why not, even their sexual habits).
- From the new geolocation options it could check the places the user visits (e.g. where they spend most of their time, which might be a gym or a McDonald's).
- It could even work out whether the user spends too much time on Facebook from a PC (and so has a sedentary lifestyle) by checking how long they are online (and verifying that they are actually active, of course).
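The last point, measuring how long someone is online while checking that they are really active, could be sketched as follows. This is only an illustration under stated assumptions: it supposes the application already has a list of timestamps for the user's visible actions (the `events` list below is invented), and it treats any gap longer than a hypothetical 5-minute threshold as idle time. It does not use any real Facebook API.

```python
from datetime import datetime, timedelta

def estimate_active_time(timestamps, idle_gap=timedelta(minutes=5)):
    """Estimate active time from activity timestamps.

    Gaps between consecutive events are added up, but only when the gap
    is at most `idle_gap`; longer gaps are treated as the user being away.
    """
    ordered = sorted(timestamps)
    total = timedelta(0)
    for earlier, later in zip(ordered, ordered[1:]):
        gap = later - earlier
        if gap <= idle_gap:
            total += gap
    return total

# Invented activity log for one user on one day (hypothetical data).
events = [
    datetime(2011, 1, 1, 10, 0),
    datetime(2011, 1, 1, 10, 2),
    datetime(2011, 1, 1, 10, 4),
    datetime(2011, 1, 1, 11, 0),   # long pause before this event: not counted
    datetime(2011, 1, 1, 11, 3),
]

print(estimate_active_time(events))
```

The threshold is a design choice: a small value undercounts people who read for a long time without clicking, while a large value counts time spent away from the screen as activity.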