Tuesday, December 20, 2011

A wedding and a funeral

I began thinking about the Facebook graph.  While, like Tony, I am disappointed in what they gave us, I have realized there is some useful information in it.  In this case I am looking at the month of May.  Here we had the royal wedding and the death of Osama.  While it's not clear, because of the way the data is presented, I think we can infer some key elements of memes.
  1. These two memes are the only ones overlapping in 2011. 
  2. The marriage begins to spike before Osama's death 
  3. The royal marriage ends its spike just as Osama's death is reported
  4. So we may be seeing one meme stealing the audience of another.  
This theft is important because it shows how one meme spikes at the expense of other conversations.  But until we get data with more fidelity we can only infer that Osama's death stole audience away from the royal wedding.

The other thing we should take notice of is the type of memes the spikes represented.  They all seemed to be calls to celebrate, mourn or pay attention to something.  They evoke an emotional response to share.  

Finally, a bit about the natural level of conversation.  While we cannot see the day-to-day changes in the conversations, we do know they exist, both from observation and from the data.  As Tony pointed out, Facebook broke the data into 4 subgroups and each group came with a top ten list of stories.  This gives us 40 stories that did not spike, so we can infer a baseline level that has to be lower than the Irene spike and that stays pretty stable, since it can absorb 40+ events without spiking.  

Friday, December 9, 2011

Facebook's top ten memes of 2011

They gave us a link here and you can find the global memology and the US-specific memology.  The Y axis is labelled "Status update mentions ranked by growth (2010 vs 2011)." There are several small upticks in conversation that happened without making it into the top 10.  They show actors, movies, TV shows, and fictional characters' ranks, too.  They also added tabs for music, sports, and news.

One might notice how many new conversations were struck up for these top-10 events by examining the difference between pre/post-event chatter and the chatter in the event itself.  Some factors that we can consider influencing the difference include:
  • lurkers making a post for the first time in a while
  • actual discourse
  • people joining Facebook just to talk(?)
  • lots of reposts, copy/pastes, or likes (we do not know that like button hits are counted; are likes status updates or not?)
It is unfortunate that the full range of chatter, what one would see on a normal basis, is either so small that it is occluded by the large spikes (a logarithmic Y axis would have helped here) or was not actually included in the data.  Perhaps when they say "2010 vs 2011" they mean that they subtracted the conversations of 2010 from the 2011 data, ignored negative numbers (those would be events in 2010), and showed only peaks above a certain threshold (removing noise).
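
The guess above about how the "2010 vs 2011" numbers were produced can be sketched in a few lines of Python. This is only an illustration of that guess, not Facebook's actual method: the function name `growth_series`, the per-day lists, the threshold value and the sample numbers are all invented for the example.

```python
def growth_series(mentions_2010, mentions_2011, threshold):
    """Sketch of the guessed processing: subtract, drop negatives, drop noise.

    Both inputs are assumed to be lists of mention counts for the same
    sequence of days in each year (hypothetical data layout).
    """
    if len(mentions_2010) != len(mentions_2011):
        raise ValueError("both years must cover the same number of days")

    result = []
    for before, after in zip(mentions_2010, mentions_2011):
        # Growth is this year's chatter minus last year's; negative values
        # (events that were bigger in 2010) are ignored by clamping to zero.
        growth = max(after - before, 0)
        # Anything below the threshold is treated as noise and removed.
        if growth < threshold:
            growth = 0
        result.append(growth)
    return result


# Invented numbers: day 3 was bigger in 2010, day 4 spikes in 2011.
sample = growth_series([10, 12, 50, 11], [11, 13, 20, 90], threshold=5)
print(sample)
```

If the published graph was built anything like this, the normal day-to-day chatter would be flattened to zero, which would explain why only the spikes are visible.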

Oh for the raw data.

Sunday, November 27, 2011

Blog News

Tim and I decided to split this blog up into two:

  • This blog, SMiSC Roundup, tracks news about the DARPA solicitation and related technologies.
  • The sister blog, SMiSC Open Response, holds the discussions Tim and I have about the path to creation of a set of systems that can perform TA1 and TA2 duties.
Happy reading!

Visual Media Reasoning System


A DARPA solicitation dated Oct 5 calls for a Visual Media Reasoning system to identify the who, what, where, when and noise troops may encounter in a field video or, I would guess, an image.  While not SMiSC news, it could be used as part of our system to convert video into text.  For more information see

http://www.darpa.mil/opportunities/solicitations/i2o_solicitations.aspx

SMiSC Final Submissions

Final submissions for SMiSC were due October 11th.  Therefore I expect we should start getting more news on the program in the coming months.

Friday, November 25, 2011

Detecting emotions in voice is interesting: this content could be added to the markup of a speech-to-text conversion (here).

Saturday, September 24, 2011

Just how much data must TA 1 mine?

Here's a quote from the initial introduction to TA 1 technologies:
TA1 performers will develop automated and semi-automated operator support tools and techniques for the systematic and methodical use of social media at data scale and in a timely fashion 
Since I have been working on TA 2 test systems with just 5k users, and finding 185k posts per fake year at a posting rate p of 0.1 posts/day,  I wondered what the real world has in store for SMiSC.  That is, what is "social media at data scale and in a timely fashion?"

Well, here is "data scale" according to By The Numbers: Twitter Vs. Facebook Vs. Google Buzz
Updates/Posts
  • Facebook status updates: 700 per second
  • Twitter tweets: 600 per second
  • Buzz posts: 55 per second
1355 updates per second, discriminated, categorized, aggregated, and reported on.  A "timely fashion" implies that it is okay to be "behind" by some time, but eventually the system must process everything.  I figure the requirement for maximum delay is set up to give a report on any new/significant meme within our leaders' decision-making cycle so that leaders cannot be outfoxed by a rapidly-spreading strategic message.

Yikes.
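
To put the figures above side by side, here is a back-of-the-envelope sketch in Python. The per-second rates are the ones quoted above, and the 5k users and 0.1 posts/day come from the TA 2 test setup mentioned earlier; the function names and the 365-day year are assumptions made for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Per-second rates as quoted in the "By The Numbers" article.
RATES_PER_SECOND = {
    "Facebook status updates": 700,
    "Twitter tweets": 600,
    "Buzz posts": 55,
}


def total_rate(rates):
    """Combined updates per second across all listed services."""
    return sum(rates.values())


def updates_per_day(rate_per_second):
    """Scale a per-second rate up to a full day."""
    return rate_per_second * SECONDS_PER_DAY


def expected_posts_per_year(users, posts_per_user_per_day, days=365):
    """Expected post count for a test population (assumes a 365-day year)."""
    return users * posts_per_user_per_day * days


combined = total_rate(RATES_PER_SECOND)
print("combined updates per second:", combined)
print("combined updates per day:", updates_per_day(combined))
print("expected test posts per year:", expected_posts_per_year(5000, 0.1))
```

The expected yearly count for the test population works out to about 182,500 posts, which is in the same neighborhood as the 185k observed; the real-world feeds produce well over a hundred million updates in a single day.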

Here's stuff just on Facebook (current): FB stats
Twitter doesn't seem to have a similar page.
Couldn't find one for Google+ either.

Friday, September 2, 2011

DARPA's SMiSC Industry Day


On August 2, DARPA held the first SMiSC industry day. It was hosted by Systems Planning Corporation, in Arlington, from 10:00 AM-5:30 PM. The day was divided into two parts.  The morning session was an introduction to SMiSC and the BAA process. The second half of the day consisted of one-on-one sessions between attendees and Dr. Rand Waltzman. Nothing of importance has been reported about this day yet.  Despite this, I can offer two bits of information.

First, I noticed that the only documents posted by Dr. Rand Waltzman are on the IRB process. This leads me to guess that TA2 is of importance and that they recognize the complications of doing human studies. Second, we were able to obtain a list of attendees. Analyzing it, we can see several things. There were 122 people attending, divided roughly into 76 different corporations, agencies and educational institutions. Sixty of them were companies: 30 were primarily defense companies and the others were non-defense firms.  The remaining organizations were academic institutions or government agencies.  The leading specialities for the defense companies were simulation, security and enterprise solutions.  The non-defense firms looked at social networks, linguistics and data mining.

What we see, at least by this list, is that the military is behind the curve when it comes to using social networks as a source for data mining and developing ways to monitor their content.  This would shock many reporters.  Indeed, civilian companies have been monitoring the Internet for years to develop not just market strategy but PR and political strategies as well.  It will be interesting to see how the military applies the technology in the coming years.  

Tuesday, August 16, 2011

A little lite listening or not!

For the fun of it, check this out: http://www.bbc.co.uk/programmes/b00cftq8. It's a reading of Pattern Recognition, written by the famous writer William Gibson.  While fictional, much of it has bearing on SMiSC objectives.

Memes shut down Bay Area Rapid Transit (BART)

What do Twitter, cell phone access, Anonymous and BART have in common? Today it is SMiSC. Starting last week, a few people picked up on BART's shutting down its wireless and cell phone connections across its routes. BART did this out of fear of protests over the shooting of a man by a BART officer. Hitting fast forward, we see that Anonymous hacked BART's website, blasting out the account information of thousands and calling on supporters to show up Monday to protest the shutdown.  BART then went into damage control as protesters closed in. Transit stations were shut down as protesters demanded open access to their cell phones and wireless networks. Gone is the cause of justice for the shooting victim.  It has been replaced by an overwhelming need for the protesters to connect with their European and Arab world counterparts.  Internet radicals tweeted tenuous connections between the Egyptian cell shutdown and BART's lack of judgment.

How do I know all this? Not from the news but from a series of tweets sent out during a four-hour period. My favorite quote: “Anonymous carried off a physical denial of service.” But it wasn’t Anonymous. It was a series of people seeking to feel part of the year's protesting. These people have been yelling for Americans to wake up and join their downtrodden brothers. Today this finally happened. No matter how small it was, the Twitter/Internet radicals joined forces physically to “shutdown” BART.

Looking at SMiSC's premises we see some issues. First, we see how memes may evolve as they come into contact with other memes. Second, we have multiple memes: protesters, social justice, Anonymous and the cellular shutdown. Third, we can see how key people help to transfer memes. 

In the end we see that all of these are related to issues brought up by DARPA's introduction.

1. Detect, classify, measure, and track memes and purposeful or deceptive misinformation.
In this case we have three memes that need to be tracked. Looking at Twitter it started with social justice, moved to phone service being cut off, followed by Anonymous. Measuring these, one can estimate that phone service being cut off and linked to global protest movements was the strongest meme.

2. Recognize persuasion campaign structure.
There were two persuasion campaigns being played out Monday. The first was the call to protest social injustice, which was linked to the global protest movements. The second was Anonymous' message, “down with oppressive regimes.”

3. Identify participants and intent, and measure effects of persuasion campaign.
While there were hundreds of people messaging on the subject...oddly only stations were closed Monday, not the cellular/wireless networks...I found two primary meme carriers. One is a famous writer, and the other an editor for an online news magazine. Through these two people, who have hundreds of thousands of followers, flowed messages encouraging the protesters and berating BART. The most direct effect was the connecting of a local issue with the global protest movement.

4. Counter messaging of detected adversary influence operations.
BART did poorly in this area. First, they did not foresee the consequences of threatening to shut down the networks. Second, the fear of Anonymous and the protesters turned BART into its own worst enemy. While they kept the networks open Monday, closing down different BART stations just emboldened the protesters. Had they chosen to underreact, what was a virtual protest would have passed.

I watched this event in real time. SMiSC would have to be able to do this automatically or with personnel. To accomplish this, thousands of existing memes will have to be identified and classified, so that computers or analysts can spot old memes in operation or new memes being developed, the latter being the hardest.  I am not so certain the persuasion campaign is that important, unless you're talking about an intentional launching of memes in a pattern to create a desired outcome. In the BART example, only Anonymous seemed to come anywhere close to a coordinated meme attack. Even then it was just to cause chaos. While classifying memes is important, SMiSC needs to identify the people who can spread memes. The people are the linchpin of any meme operation. SMiSC needs to either have these people in a database linked to existing memes or find ways to identify them through some measurement (number of followers?).  The capability of a meme carrier to infect people may be more important than the meme. If a meme is launched into only a small social network of a few dozen, it goes nowhere. If picked up by a carrier connected to thousands, if not millions, of people, it becomes a movement.   
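
As a rough illustration of the "number of followers?" idea above, here is a minimal Python sketch that flags potential meme carriers by follower count. It is only one possible measurement, not a proposed SMiSC design: the `Account` type, the `find_carriers` function, the account names, the follower counts and the cutoff are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    name: str        # hypothetical account handle
    followers: int   # follower count used as a crude measure of reach


def find_carriers(accounts, min_followers):
    """Return accounts at or above the cutoff, largest audience first."""
    carriers = [a for a in accounts if a.followers >= min_followers]
    return sorted(carriers, key=lambda a: a.followers, reverse=True)


# Invented accounts: two with large audiences, two with small ones.
accounts = [
    Account("local_rider", 40),
    Account("famous_writer", 300_000),
    Account("news_editor", 150_000),
    Account("weekend_protester", 25),
]

for carrier in find_carriers(accounts, min_followers=100_000):
    print(carrier.name, carrier.followers)
```

Follower count alone says nothing about whether an account actually passes a given meme along, so a real measurement would presumably need to combine reach with observed reposting behavior.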

Saturday, August 13, 2011

Let's shut them down-- SMiSC becomes irrelevant

Well, just when we thought it was worth a 42 million dollar contract, the Brits want social media shut down. The Wired blogger makes a comment that the TV needs to go, too.  That's a new twist for an open culture.

Saturday, August 6, 2011

For good or bad?

There are lots of good and bad things about the efforts here; there is no doubt that these tools and techniques can be on the immoral side if used incorrectly.  Social disruptions-as-weapons are like bullets-- it depends on why they are used-- and I can fully imagine a world where the intentional state-to-state social disruption will be viewed as at least a violent act.  There were plenty of comments in this vein at this Wired article.

Meme Bibliography

A nice start for this section is the Wikipedia entry on memes; it has a reference section that's worth looking at.  I've not read many of these works but I intend to do so over time.  Here are some that look juicy and relevant to the SMiSC efforts:

  1. Can we measure memes? by Adam McNamara
  2. Lynch, Aaron (1996), Thought contagion: how belief spreads through society, New York: BasicBooks, pp. 208, ISBN 0-465-08467-2
  3. Godwin, Mike. "Meme, Counter-meme". Wired
Hopefully Tim will add an academic search result list here or in a comment.

Technical Area 3 (TA 3): Algorithm Integration, Test and Evaluation

The TA 3 performer will work with TA 1 performers to develop appropriate performance metrics and develop, execute and evaluate the results of corresponding test and evaluation procedures. Test and evaluation procedures will include red team activity involving strategic communication and influence operations in the closed environment developed by the TA 2 performer.

Technical Area 2 (TA 2): Data Provision/Management

The TA2 performer will create a closed and controlled environment where large amounts of data will be collected and experiments will be performed in support of the development and testing of TA 1 algorithms. One example of such an environment could be a closed social media network made up of two to five thousand people where participants have agreed to conduct a significant portion of their social media based activities within the network and agree to participate in the required data collection and experiments. Such a network might be formed within a single government, industrial or academic organization or across multiple such organizations. Another example of such an environment would be a massively multiplayer on-line role playing game where the use of social media is of central importance to game play and with tens of thousands of players that agree to participate in the required data collection and experiments.

The TA 2 performer will work closely with TA 1 and TA 3 performers to support the type of data collection, experimentation and evaluation required.

The type of data required for SMISC research potentially contains Personally Identifiable Information (PII). The TA 2 performer will be required to certify that no PII for U.S. persons was collected, stored or created in contravention to federal privacy laws, regulations and Department of Defense (DOD) policies. Proposers must address the collection and use of PII, if any, in their technical proposal. PII will not be provided to SMISC from another Government agency or from an outside source.

SMiSC Technical Area 1 (TA 1): Algorithm/Software Development

TA1 performers will develop automated and semi-automated operator support tools and techniques for the systematic and methodical use of social media at data scale and in a timely fashion to:
1. detect, classify, measure and track the
  • formation, development and spread of ideas and concepts (memes) and,
  • purposeful or deceptive messaging and misinformation;
2. recognize persuasion campaign structures and influence operations across social media sites and communities;
3. identify participants and intent, and measure effects of persuasion campaigns; and
4. counter messaging of detected adversary influence operations.
TA 1 performers are required to define and validate appropriate performance metrics for algorithms and techniques developed. This will require TA 1 performers to also develop measures of the effectiveness of strategic communication and influence operations.

Discussing the DARPA Social Media in Strategic Communications Initiative and Related Technologies

The US DARPA has released the Social Media in Strategic Communication solicitation.  The solicitation is located here.  In case you are just passing through, this is the summary:

"DARPA is soliciting innovative research proposals in the area of social media in strategic communication. Proposed research should investigate innovative approaches that enable revolutionary advances in science, devices, or systems. Specifically excluded is research that primarily results in evolutionary improvements to the existing state of practice"
This blog will focus on the solicitation, its efforts and products, and the technologies related or adjacent to this work.

Hopefully, our posts won't be so dry that the blog becomes too journal-like.