# Wrap up [11/19]

This is a classic example of how people from different professions solve the same problem in their own distinctive ways. The task is to paint the 4 walls of a room given a bucket that holds just enough paint for two walls. The task was given to an Engineer, a Physicist and a Mathematician. Let us see how each of them went about solving it.

The Engineer painted two of the four walls and, as expected, emptied the bucket. Next came the Physicist’s turn. The Physicist never started painting, because he was still trying to calculate how best to use the paint to cover all 4 walls. The Mathematician, however, painted all 4 walls and, surprisingly, still had the full bucket of paint. Now let us reason out the approach each of them took.

The Engineer never really thought about whether the given quantity of paint would be enough for all 4 walls, so he simply painted until the paint was exhausted. As we saw earlier, the Physicist spent his time on calculations before actually starting to paint. The reason is that a physicist always works towards computing the laws of the system, so a lot of thought is required before solving the problem. The Mathematician, on the other hand, painted only the rational points on all 4 walls: the rationals are countably infinite and form a set of measure zero, so covering them uses up no paint. This example shows that different people deal with problems in different ways.

A dataset that tracks people’s locations is useful in many ways. One of the papers models the spread of epidemics such as malaria through human movement. The study focuses on human movement between poor-employment regions and rich-employment regions, in both directions, since people tend to reside in places that are less expensive, i.e., poor-employment regions. Stochastic modeling of population mobility therefore helps us identify the causes of malaria diffusion in the social network.

Similarly, another paper deals with “Poverty Analysis in Senegal”, in which the information flow in the network plays a significant role in determining poverty in Senegal. The different areas of Senegal are represented as nodes of a virtual network. If a particular area is poor, its node is among the least visited in the network. So, by implementing better models of information flow in the network, we can identify the areas that are poor and work towards eradicating poverty. The researcher used the Google PageRank approach to rank the poverty levels of different cities in Senegal.
In addition, the price variation of millet in Senegal was captured using a satellite map of production. The researcher again focused on information flow and deduced that millet would be fairly priced in all regions of the country only if information flow is handled well.
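To make the ranking idea concrete, here is a minimal sketch of PageRank by power iteration on a small directed network of areas. The area names, the links between them, and the damping factor of 0.85 are illustrative assumptions, not values from the paper; the sketch only shows how a node that receives little flow ends up with the lowest score.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank on a dict {node: [nodes it links to]}."""
    nodes = sorted(set(links) | {v for targets in links.values() for v in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for u in nodes:
            targets = links.get(u, [])
            for v in targets:
                new_rank[v] += damping * rank[u] / len(targets)
        rank = new_rank
    return rank


# Hypothetical flows between four areas (placeholder names, not data from the paper).
flows = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
ranks = pagerank(flows)
least_visited = min(ranks, key=ranks.get)
print(least_visited, ranks)
```

In this toy network no area sends any flow to `D`, so it keeps only the baseline score and comes out as the least visited node.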

Poverty is multidimensional: it can be measured in terms of wealth, education, healthcare and so on, so a number of factors must be taken into account to determine it. To measure these factors, a census would have to be conducted across all of Senegal. This requires substantial resources such as people, time, travel and money. Even then, it is very hard to cover the entire population, or even a large share of it, with surveys. To overcome these shortcomings, mobile phone datasets can be used as a suitable replacement. The reason is that nowadays almost 96% of the world uses a mobile phone, and mobile connectivity has reached such an extent that even people living in remote villages can be reached. This helps in collecting survey-like results at a very fine level. The important features of mobile phone datasets include the location/mobility traces of people and the interactions between cities/communities.

Another paper, which details “Survey results on mobile phone datasets”, focuses on the point that the inferences drawn must generalize to a larger part of society while using fewer social resources. The researchers gather Call Data Records (CDRs), which record how, when and with whom communication happened, covering both message and call data.
Based on these CDRs, a social network is constructed with people as nodes, and a link exists between two people if their communication is reciprocated. Here, we can either use a directed link, with the caller as source and the callee as destination, or attach a weight to an undirected link indicating the number of times the two entities communicated.
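The network construction described above can be sketched as follows. This is a minimal illustration that assumes each record is reduced to a `(caller, callee)` pair; the record format, the user IDs and the function name `build_reciprocal_network` are hypothetical and not taken from the paper. It keeps only reciprocated pairs and uses the total number of communications as the weight of the undirected link.

```python
from collections import Counter


def build_reciprocal_network(records):
    """Return {frozenset({a, b}): weight} for pairs that contacted each other in both directions."""
    # Count communications per direction, ignoring self-contacts.
    directed = Counter((caller, callee) for caller, callee in records if caller != callee)
    edges = {}
    for (a, b), count in directed.items():
        if (b, a) in directed:
            # Weight = communications in both directions combined.
            edges[frozenset((a, b))] = count + directed[(b, a)]
    return edges


# Hypothetical anonymized records: (caller, callee).
records = [
    ("u1", "u2"), ("u2", "u1"), ("u1", "u2"),
    ("u1", "u3"),
    ("u3", "u4"), ("u4", "u3"),
]
network = build_reciprocal_network(records)
for pair, weight in network.items():
    print(sorted(pair), weight)
```

The one-way contact from `u1` to `u3` produces no link, which is the effect of the reciprocity requirement. Keeping the `directed` counter instead would give the directed variant mentioned in the text.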

The researchers used the mobility traces of people to determine the locations they visit frequently. They also focused on information diffusion in the network to represent problems such as the spread of epidemics, the spread of viruses through Bluetooth or multimedia messages, and viral marketing. The behavior of a community was studied based on the digital signature of a city or a group of people. It was noticed that people belonging to the same community tend to communicate among themselves rather than with people outside their community. This also brings out the influence of ethnicity on population behavior.
The gravity law states that the probability that two people are connected decreases as the spatial distance separating them increases. Secondly, the duration of a call increases with the distance separating the callers. Both observations hold only up to a threshold distance, beyond which the values become constant or change very little.
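One way to write the first statement down, purely as an illustrative assumption, is a power-law decay that levels off at the threshold. The exponent $\alpha > 0$ and the threshold distance $d^{*}$ are hypothetical symbols introduced here; the notes do not give a functional form or any fitted values.

```latex
p(d) \;\propto\;
\begin{cases}
  d^{-\alpha}, & d \le d^{*}, \\
  (d^{*})^{-\alpha}, & d > d^{*},
\end{cases}
\qquad \alpha > 0 .
```

Here $p(d)$ is the probability that two people separated by distance $d$ are connected. It decreases with $d$ up to $d^{*}$ and stays constant afterwards, matching the qualitative behavior described above.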

Another interesting observation in the above paper concerns how long people stay in phone contact. According to the researchers, people relocate every 7 years on average; after relocating they meet new people, and hence new contacts are made. To keep the contact list in the phone manageable, some of the old unused contacts are deleted.
Next, we see how people benefit from contacting weak ties for employment opportunities. Strong ties (people with whom we communicate on a daily basis) know pretty much the same information and contacts as we do, so they bring little benefit in terms of new job opportunities. When weak ties are contacted, however, the network grows beyond the scope of known information, and the chances are high that relatively more job opportunities can be explored.

The paper also talks about the use of social networks in disasters. When an emergency occurs (bomb blast, plane crash, earthquake), eyewitnesses are contacted immediately after the event. Also, by analyzing the social amplifiers (nodes with the highest degree) and their immediate neighbors, it is possible to detect emergencies. However, work remains to be done on predicting emergencies using this social network.

Studying the mobility traces of people helps in assessing the economy of a country: higher mobility indicates a better economy, as do fewer, larger airtime purchases. The mobility of people also helps in urban planning (improving the roads in developing countries) and in better planning of the transportation network to ease traffic congestion.

Despite all these benefits of mobile phone datasets, privacy remains a challenge. It is important that service providers do not reveal users’ personal information, such as phone number and age. Hence, suitable anonymization techniques must be employed to protect the identity of the user.

Delving into the social dynamics aspect, models such as the flocking model, the language model and the culture model are used, wherein the impact of flock movement is evaluated on real-world problems; for example, the herd behavior of people affecting prices in the stock market. Secondly, the researchers talk about interacting particle systems, to check whether a generalizable claim can be made about the patterns observed across all datasets.
NetLogo is another language, used primarily for simulation; it is used to build game-theoretic models of network formation and to facilitate information exchange so as to maintain fairness in the social network.

Urban dynamics is an interesting topic in the field of social dynamics. There are several good textbooks on this topic, such as “Cities and Complexity” and “Urban Dynamics”. Researchers have used agent-based models to simulate the movement of people, resources, populations and so on, and reached some interesting conclusions. There are also software packages for simulating urban dynamics. One example is UrbanSim, where urban dynamics are simulated at the level of individual agents. SimCity, a famous game, is an interactive simulation of urban dynamics.

Stochastic processes can also be used to model the dynamics in systems biology and chemical reactions. By discretizing time, we can make inferences about the concentrations of chemicals that we cannot observe. Interestingly, a similar methodology can be used to study the asymptotic behavior of our social systems, because there are simple correspondences between different kinds of people and different chemicals.
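As a rough illustration of the correspondence between people and chemicals, the sketch below simulates a single reaction of the form S + I → 2I in discrete time, which can be read either as an autocatalytic chemical reaction or as a susceptible person becoming infected after contact. The reaction, the parameter values and the function name `simulate_reaction` are assumptions chosen for illustration; this is a simple forward simulation, not the inference procedure itself.

```python
import math
import random


def simulate_reaction(s0, i0, rate, dt, steps, seed=0):
    """Discrete-time stochastic simulation of S + I -> 2I with counts (s, i)."""
    rng = random.Random(seed)
    n = s0 + i0
    s, i = s0, i0
    history = [(s, i)]
    for _ in range(steps):
        # Probability that a single S unit reacts during one step of length dt.
        p = 1.0 - math.exp(-rate * i / n * dt)
        # Each S unit reacts independently, so the event count is binomial.
        events = sum(1 for _ in range(s) if rng.random() < p)
        s, i = s - events, i + events
        history.append((s, i))
    return history


# Hypothetical setup: 99 units of S, 1 unit of I.
history = simulate_reaction(s0=99, i0=1, rate=0.5, dt=0.1, steps=200, seed=1)
print(history[-1])
```

The total count `s + i` is conserved at every step, and over a long horizon the system drifts towards the state where all units are of type I, which is the kind of asymptotic behavior the paragraph refers to.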

Financial modeling is also a field full of stochastic processes. One famous process is Brownian motion. Its paths are continuous but nowhere differentiable: the increment over a short time interval is always small, yet the ratio of the increment to the length of the interval blows up as the interval goes to zero. A more advanced process, the Lévy process, which is also widely used in this field, combines Brownian motion with a jump process. In contrast with the previous dynamic models, there are currently no exact inference algorithms for these processes. The common tool is simulation.
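Since simulation is the common tool here, the following is a minimal sketch of simulating Brownian motion with added jumps on a time grid. It is an approximation under stated assumptions: Gaussian jump sizes, at most one jump per step with probability `jump_rate * dt` (reasonable only for small `dt`), and parameter values that are purely hypothetical.

```python
import math
import random


def simulate_jump_diffusion(t_end, steps, sigma, jump_rate, jump_scale, seed=0):
    """Approximate a path of Brownian motion plus jumps on a grid of `steps` intervals."""
    rng = random.Random(seed)
    dt = t_end / steps
    path = [0.0]
    for _ in range(steps):
        # Brownian part: Gaussian increment with variance sigma^2 * dt.
        increment = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Jump part: one jump with probability jump_rate * dt (small-dt approximation).
        if rng.random() < jump_rate * dt:
            increment += rng.gauss(0.0, jump_scale)
        path.append(path[-1] + increment)
    return path


# Hypothetical parameters chosen only for illustration.
path = simulate_jump_diffusion(t_end=1.0, steps=1000, sigma=0.2,
                               jump_rate=5.0, jump_scale=0.1, seed=42)
print(len(path), path[-1])
```

The Brownian increments scale like `sqrt(dt)`, so dividing them by `dt` gives values of order `1 / sqrt(dt)` that grow without bound as the grid is refined; this is the non-differentiability described above. Setting `jump_rate=0` recovers plain Brownian motion.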

The most used tool in this course for inference in dynamic models is variational inference, which is closely related to the law of large numbers and large deviation theory.