Friday, November 23, 2012

Weighted Graph: Make SNA More Accurate

In the latest three lectures, we are in touched with Social Network Analysis (SNA), which is a key component of the Social Network Research Field. And we get a superficial understanding of the different algorithm defined in SNA by doing some exercise on and after classes. And a question came to my mind afterwards. Did the social graph correctly represent the relations between human beings in SNS?

Let’s have an exercise to find the problem.
We assume that 5 people (From n1 to n5) are in the same SNS and their relationship can be represented by the graph below (Graph1).



Graph1: Normal Graph

Then we will use some algorithms in SNA to analysis it.

Degree Centrality:

Cd(ni)
C’d(ni)
n1
4
1
n2
1
1/4
n3
3
3/4
n4
2
1/2
n5
2
1/2
Cd = 0.667
Closeness Centrality:

Cc(ni)
C’c(ni)
n1
1/4
1
n2
1/7
4/7
n3
1/5
4/5
n4
1/6
2/3
n5
1/6
2/3
Cc = 0.26
Betweenness Centrality:

Cb(ni)
C’b(ni)
n1
7/2
7/12
n2
0
0
n3
1/2
1/12
n4
0
0
n5
0
0
Cb = 0.5625


From the results of SNA, we can get some information:
1. The SNS shown by Graph1 is with high Degree Centrality and Betweenness Centrality.
2. Node n1 is the most important node in this SNS (with highest Cd, Cc and Cd).

However usually the relationships in the real SNS are not such simple, there are thousands types of relationships, for example, Tim and Bill are best friends while Tom and Sissy were just got to know, and obviously the strength of two previous relationships are not the same so I think it is unfair to assign all relations in SNS with the same value 1.

Let’s use the case showed by Graph1 again and adding some additional information: (1) n3 and n4 are best friends. (3) n3 and n5 are old friends. (4) n1 is a friend of n2 (5) n1 was just got to know n3, n4 and n5.

And this time we divide the relations into 4 levels and use a weighted graph to represent their relations again, as shown in Graph2.


Graph2: Weighted Graph

Now do the SNA again with the same algorithms.

Degree Centrality:

Cd(ni)
C’d(ni)
n1
5/4
5/16
n2
1/2
1/8
n3
2
1/2
n4
5/4
5/16
n5
1
1/4
Cd = 0.333
Closeness Centrality:

Cc(ni)
C’c(ni)
n1
1/14
2/7
n2
1/20
1/5
n3
3/37
12/37
n4
3/40
3/10
n5
3/41
12/41
Cc = 0.256
Betweenness Centrality:

Cb(ni)
C’b(ni)
n1
Unfinished
Unfinished
n2
Unfinished
Unfinished
n3
Unfinished
Unfinished
n4
Unfinished
Unfinished
n5
Unfinished
Unfinished
Cb = Unfinished



Due to time limited, I cannot finish the calculation of Betweenness Centrality (Awful complex, I’m afraid (- -#)) and there may be some mistakes in calculation, but the result changes a lot:
1. Degree Centrality of the SNS getting lower, it is about 1/2 of the previous SNA.
2. Node n3 becomes the most important node (with highest Cd and Cc), while in the previous SNA it is n1.

In conclusion, the result completely changed when we take the weight of the relationships into consideration, and I believe using weighted graphs is a more reasonable and humane way to do SNA (If you do not consider the weight of relations to analysis the second case, the result will be unconvincing).


References:
  • Angela Bohn & Norbert Walchhofer & Patrick Mair &Kurt Hornik: Social Network Analysis of Weighted Telecommunications Graphs
  • M. E. J. Newman: Analysis of weighted networks
  • http://toreopsahl.com/tnet/weighted-networks/node-centrality/

8 comments:

  1. Nice try there, seems that the weight can better simulate the relation between user. Somehow your method remind me of the relation "secretly follow" in Weibo... what shall its weight be? But anyway it's a directed graph so maybe we can't discuss it here.

    ReplyDelete
    Replies
    1. It is a very good question, but I think a "secretly follow" should be regard as a normal directed relation in SNA because you will not realize someone is "secretly follow" you and very likely you will not follow him/her.

      Delete
  2. Many students writes this on their blogs, kind of helping their reviewing, right?
    Answering the question above raised by first floor, in my opinion, the "secretly follow" is really small part of the whole SNA graph. If you want to analyze a specific SNA, the majority part of SNA without the small part mentioned above is enough for us, how do you think?

    ReplyDelete
  3. Thank for your example to explain the question"Did the social graph correctly represent the relations between human beings in SNS? ".After reading your article I also think that using weighted graphs is a more reasonable and humane way to do SNA.

    ReplyDelete
  4. woo, your blog is so professional to explain the problem you mentioned in your article. What's more, your References attached in the end also demonstrated your hard work about searching for methods to solve the problem. Thanks for your elaborative explanation.

    ReplyDelete
  5. wow, quite a hard work.
    but, in reality, the most difficult part, i think, is how to get the value of the weigh graph :)
    kind of a semantic stuff. how to define how close and the value to describe that relation, it's such a headache i think :)

    ReplyDelete
  6. @@ wow,that is cool, you give the weight to each node, that must be more precise to figure out the SNA data. The example you used is also good, maybe I should try to use your idea to calculate something.

    ReplyDelete
  7. I agreed with Yupbank, and actually it's about we are using the raw data to do the analysis that we have assume the weighted factor be "1", while we shall assume the popularity of the people/node will be reflected by the communication (links) between him/her with other. Whatever, your idea is interesting

    ReplyDelete