Dynamical Endo-Exo Search

We would like to develop our research into a new technique for identifying relevant content from the content's dynamic evolution. This way of searching differs from the more traditional methods employed by Google and others. Those methods use a snapshot of the link-weighted structure of the web to determine relevance, whereas our technique extracts information from the user-generated dynamics. It amounts to identifying the "shape" of the dynamics as a way of determining how "ripe" the community was for the content it consumed.

This approach rests on the observation that the relaxation dynamics following a burst of activity contain information about the quality of the content, in addition to revealing the susceptibility of the community to this content. Categorization is based on a model of an epidemic branching process. Our approach uses the memory of past popularity dynamics to construct a measure of content quality. This differs from existing technologies, such as those employed by Google, which create a ranking from a static, instantaneous snapshot of webpage popularities. We have developed a method for extracting information about the quality of a video (or other content) from an analysis of the time evolution of its view count (or of sales in the case of books, movies, etc.). The method measures the relaxation signature following a burst of viewing activity, as well as the precursory dynamics before the peak. This dynamical signature depends on the susceptibility of a community to particular content and on the underlying mechanism that generated the burst of activity, which need not be known a priori. The dynamical evolution can be rationalized in the context of a simple model of an epidemic branching process, from which a classification of content quality can be made.

The technical problem is to identify and sort "quality" content from "junk", given a massive database consisting of millions of videos. While "quality" and "junk" are subjective measures that depend on individual preferences, our solution makes them objective in the context of the community of viewers who have already viewed the content. In essence, we extract a classification of the quality of videos, internet posts, etc., by using the time-dependent effective voting system provided by the collective actions of internet users. The solution is based on a model that distinguishes between the endogenous and exogenous sources that result in a burst of activity. In the context of videos, this amounts to a distinction between views generated by word-of-mouth or viral spreading and those generated by marketing, advertising, or featuring. The model provides a classification based on the exponent governing the relaxation of the view count. A more technical description is provided in the supporting materials. We have successfully tested these ideas using a massive database tracking the time series of the daily view count for almost 5 million videos. Beyond what we have described here, one can imagine combining this technique with demographic information to form an even more complete picture of the quality of the content for various communities.
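As an illustration of the exponent measurement described above, here is a minimal sketch. It assumes the post-peak view count decays as a power law, views ~ A * (t - t_peak)^(-p), and estimates p by a least-squares fit in log-log coordinates. The function name `relaxation_exponent`, the `max_lag` parameter, and the synthetic series are hypothetical, introduced only for this example. The mapping from the fitted exponent to a content class is left to the model in the supporting materials and is not reproduced here.

```python
import numpy as np

def relaxation_exponent(daily_views, max_lag=None):
    """Estimate p in views ~ A * lag**(-p), where lag counts days after the peak.

    Assumption: a plain log-log least-squares fit, not necessarily the
    estimator used in the original study.
    """
    views = np.asarray(daily_views, dtype=float)
    t_peak = int(np.argmax(views))          # day of maximum activity
    tail = views[t_peak + 1:]               # relaxation after the peak
    if max_lag is not None:
        tail = tail[:max_lag]
    lags = np.arange(1, len(tail) + 1, dtype=float)
    mask = tail > 0                         # the logarithm needs positive counts
    if mask.sum() < 2:
        raise ValueError("need at least two positive post-peak days")
    slope, _intercept = np.polyfit(np.log(lags[mask]), np.log(tail[mask]), 1)
    return -slope

# Synthetic example: a burst on day 2 followed by a decay with exponent 1.4.
lags = np.arange(1, 101, dtype=float)
series = np.concatenate(([5.0, 8.0, 1000.0], 400.0 * lags ** -1.4))
p_hat = relaxation_exponent(series)
print(f"estimated relaxation exponent: {p_hat:.3f}")
```

On real view counts the tail is noisy, so in practice one would probably bin the data logarithmically or use a maximum-likelihood estimator rather than this plain fit. The same fit can be applied to the time-reversed pre-peak series to characterize the precursory dynamics.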

Dynamic PageRank

The "standard" PageRank search gives the solution of a static, self-consistent system of linear equations (of very high dimensionality). In contrast, our idea is to emphasize the endo-exo dynamics. Our strength is the dynamics; our weakness is that the approach is univariate, i.e., it does not use the self-consistent multivariable approach explicit in PageRank. The strength of PageRank is this multivariable self-consistency. I propose to combine the two by developing a dynamical version of PageRank that incorporates the endo-exo power-law approach, allowing for nonlinear acceleration and relaxation. How can this be done in practice?
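For reference, the static baseline mentioned above can be sketched as follows. This is only the textbook PageRank formulation, solved by power iteration, and not the dynamical extension proposed here, which remains an open question. The function name `pagerank`, the damping value 0.85, and the three-page example graph are assumptions chosen for illustration.

```python
import numpy as np

def pagerank(adjacency, damping=0.85, tol=1e-12, max_iter=1000):
    """Solve r = damping * (r @ P) + (1 - damping) / n by power iteration.

    adjacency[i][j] = 1 if page i links to page j (illustrative convention).
    """
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    out_degree = A.sum(axis=1, keepdims=True)
    # Row-stochastic transition matrix; pages without out-links jump uniformly.
    P = np.where(out_degree > 0, A / np.maximum(out_degree, 1.0), 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        r_next = damping * (r @ P) + (1.0 - damping) / n
        if np.abs(r_next - r).sum() < tol:  # self-consistent fixed point reached
            return r_next
        r = r_next
    return r

# Page 0 links to pages 1 and 2, page 1 links to page 2, page 2 links to page 0.
links = [[0, 1, 1],
         [0, 0, 1],
         [1, 0, 0]]
ranks = pagerank(links)
print(ranks)
```

The fixed point depends only on the link snapshot. A dynamical version would presumably have to replace the constant inputs with time-dependent activity, which is exactly the part that is not specified here.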

Track Narratives - Dave Snowden

To meet the great challenge of indexing and searching video and movie content, we could try to use the blogs and all other narratives, found on the internet and beyond, that refer to a given target video. We can then apply our technique and others to these narratives, for instance by aggregating them both with our voting techniques (time dynamics) and with the measures developed by Cynefin (Snowden). The key is to transform the problem of video content indexing, quality ranking, and relevance by using the power of the crowd, not simply by counting numbers (as we do) but by using the comments that appear across the internet.

A similar technique has already been used by Stefan Frei to reconstruct the time dynamics of exploits and patches before and after vulnerability disclosure. For this, he sent thousands of spiders and other bots to interrogate all possible sites on the Internet in order to reconstruct what Internet users as a whole knew about EACH possible target (a given piece of software and a given vulnerability of that specific software). This is what we envision for searching videos.