WWW inventor: HTML5 will make Minority Report look like child’s play | Silicon Republic
Open data heralds the internet’s next exciting phase
Considered one of the pioneering fathers of the internet, Berners-Lee believes we are only at the dawn of an even more exciting era - the era of open data and the semantic web, where almost every feasible physical device or piece of data will be interlinked online.
“The semantic web vision has taken a long time to come to fruition because the web is so exciting in many other ways,” says Berners-Lee, who has been driving new metadata labelling formats to make everything linkable.
This brings us to the next big revolution: open data. He says governments and businesses are at the forefront of opening up datasets so that individuals, citizens and other businesses can make more informed decisions. The web we are about to see will be one in which data and devices everywhere are interlinked, and metadata is central to this: establishing who owns A or B, in the same way individuals own the deeds to their homes, while still allowing the data to be usable and open.
He cites the infoboxes in the corner of Wikipedia articles, for example, as a case of how databases and datasets can be globally linked.
Redis based triple database
The Meshin application relies on a back-end triple store for holding the person semantic index and for processing front-end queries. This back-end store is built on top of Redis, an open source in-memory key-value database. Before getting into the details of how triples are represented and queried, I will briefly introduce the essential Redis features. Feel free to skip the following paragraph if you are already familiar with Redis.
Redis is a key-value store where keys are binary strings and values can be either simple binary strings or higher-order data structures. These data structures include ordered lists, unordered unique sets, secondary-level hash tables (hsets) and weight-sorted sets (zsets). Redis exposes its functionality through a simple text-based protocol. The protocol defines a number of commands and corresponding replies. Commands are either general, working on any kind of value, or specialized for a particular value type. For example, SET A X associates the binary string value X with the key A.
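To make the set commands concrete, here is a minimal sketch of one common way to index triples in Redis sets. This is a generic pattern shown for illustration, not necessarily the actual Meshin schema, and a tiny in-memory stand-in class replaces a live Redis connection (with a real client, the same sadd/smembers calls would go to the server).

```python
from collections import defaultdict

class TinyRedis:
    """Minimal in-memory stand-in for the Redis SADD/SMEMBERS commands,
    so the example runs without a server."""
    def __init__(self):
        self._sets = defaultdict(set)
    def sadd(self, key, member):
        self._sets[key].add(member)
    def smembers(self, key):
        return self._sets[key]

r = TinyRedis()

def add_triple(s, p, o):
    # Index each triple under three key orders, so any lookup with one
    # unknown position is a single set read.
    r.sadd("spo:%s:%s" % (s, p), o)  # objects of (subject, predicate)
    r.sadd("pos:%s:%s" % (p, o), s)  # subjects of (predicate, object)
    r.sadd("osp:%s:%s" % (o, s), p)  # predicates of (object, subject)

add_triple("alice", "knows", "bob")
add_triple("alice", "knows", "carol")

# Query: whom does alice know?
print(sorted(r.smembers("spo:alice:knows")))  # ['bob', 'carol']
```

The three-way indexing trades memory for query speed, which fits an in-memory store like Redis well.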
Web 3.0 is Silky?
Before I begin, allow me to acknowledge that I DO recognize how asinine it is to attempt to organize technology and its evolutions in a neat system of points (1.0 was the age of portals, 2.0 the age of Social, etc.). Clearly, the discussion/landscape is much more nuanced.
That being said, many predict that the future Web (“Web 3.0”) is a semantic one, driven by a tremendous amount of Data. An age where the lines between tech and human awareness are blurred. An age where our machines often know us better than our own mothers (a bit unsettling, I know). An age where our technological warlords force us to do their bidding (okay, that one’s a Sci-Fi fantasy…could happen, though).
We’ve seen manifestations of this future through behaviorally targeted ads that follow us across our browsing experience. Search engines that can predict our queries before we complete our first word, let alone sentence. Social applications that know an astonishing amount about us and our closest friends (or frienemies in some cases).
Our e-commerce experience has also been shaped by this “subtle” form of technological stalking. For many of us, sites such as Amazon are the centerpiece of our digital shopping universe. We go there to buy music, movies, electronics, clothing, toiletries… what doesn’t Jeff Bezos freaking sell?!
Beyond pushing merchandise of all flavors, the foundation of Amazon is, of course, their powerful suggestion engine. As we browse items, add to our carts, and make purchases, Amazon is tracking us all the way. The data collected is used to develop robust, personal user profiles, which allow Amazon to suggest very relevant items, offers, etc. for US (based on MY interests, and the interests of others like ME).
From a practical standpoint, this helps Amazon drive incremental sales and revenue. In essence, they are able to bring to the surface products that a user would be interested in, but may have otherwise missed (pushing items vs. relying exclusively on pull). Moreover, the amount of data that they have generated on individual users like myself allows Amazon to “get us” in ways that very few platforms/companies can.
Amazon Gets Silky
Recently, Amazon announced their version of an Android tablet, the Kindle Fire. This moment was compelling news on many different fronts:
- Amazon was diving head-first into the Android pool; not simply dabbling in the shallow Software-end, but fearlessly treading into the deep-end of offering their own device solution.
- The Fire would be priced aggressively, making it the “potential iPad killer” of the month (still not buying that one).
- The device would be chock-full of all of Amazon’s proprietary content goodness: Kindle books, Amazon MP3s, etc.
And then there was Silk, Amazon’s own browser, created specifically for their tablet. Beyond the shock of the company moving into the browser game, Silk promised a few revolutionary (or at least potentially revolutionary) Mobile browsing features.
The majority of media outlets/curious end users focused on the performance elements. Unlike other browser offerings, Silk would render pages in dual-fashion. Part of the heavy lifting would occur via the Cloud, with certain elements of web pages being delivered courtesy of Amazon’s own servers. Part would come from local rendering on the Fire itself. This tandem effort would allegedly increase speed of loading, making our Mobile browsing experience that much better (and all of us happy, Mobile campers).
I, myself, gravitated towards a small, but potentially monumental revelation: “Silk will also predict your browsing habits”. Now, this feature certainly is tied to the performance element as much as anything. By anticipating the pages a user is likely to visit next, the browser is able to pre-load; again increasing the speed/quality of experience.
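The idea behind that kind of prediction can be sketched very simply. The following is purely illustrative (Amazon has not published Silk’s actual algorithm): a first-order model that counts page-to-page transitions in the browsing history and nominates the most likely next URL for pre-loading.

```python
from collections import Counter, defaultdict

class NextPagePredictor:
    """Count first-order page-to-page transitions and guess the most
    likely next URL from the current page."""
    def __init__(self):
        self._transitions = defaultdict(Counter)
        self._current = None

    def visit(self, url):
        if self._current is not None:
            self._transitions[self._current][url] += 1
        self._current = url

    def predict(self):
        # Most frequent follower of the current page, if we have seen one.
        followers = self._transitions[self._current]
        return followers.most_common(1)[0][0] if followers else None

p = NextPagePredictor()
for url in ["espn.com", "espn.com/nfl", "espn.com",
            "espn.com/nfl", "espn.com/nfl/broncos"]:
    p.visit(url)

p.visit("espn.com")
print(p.predict())  # espn.com/nfl
```

A real browser would also weigh link structure and aggregate behavior across users, but the payoff is the same: fetch the likely next page before the user asks for it.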
However, a more potentially powerful use/reality exists.
If I were to visit ESPN.com, Silk would theoretically be able to determine that I am a football fan first and foremost, and that my favorite team is the Denver Broncos (Tebow Time!). As a result, it could make the accurate presumption that I would be looking for news/articles specific to those interests, and preemptively serve me the appropriate pages/content.
If I were visiting a restaurant’s site, Silk may recognize based on previous browsing behaviors (foodie blog visits, other restaurant sites, etc.), that I have a weakness for a great burger. It could then highlight the menu section/food descriptions that would best satisfy this culinary preference.
Certainly this capability already exists at the individual site level. Our web experiences are often customized based on cookies or user profiles that are activated by the sign-in. However, there is no underlying thread that allows for ubiquitous/consistent personalization as we move through disparate properties. Though Facebook has very much attempted this (and succeeded to a certain degree via Open Graph), the most seamless/all-encompassing unification would likely occur at the browser level (as it is the foundation/constant of our Web experience).
Aside from the browser itself, Amazon has also:
- Accumulated a significant amount of data and behavioral insights on its own site property (as we discussed earlier).
- Developed a sophisticated engine/algorithm to make use of said data/insights (as also discussed earlier).
Combine all of these ingredients, and Amazon has suddenly put itself in a position to not only compete in “Web 3.0”, but potentially lead the charge. Historically, the tendency has been to view the company as a digital provider of tangible goods. I, for one, believe that it is time to drastically alter that opinion, Folks.
“The internet can solve many of the difficulties in “knowing your farmer.” Much of the information asymmetry in food exists because a cost effective way to make information travel with the food hasn’t been applied. It is starting to happen in silos, for flour, for chocolate, and for wool, but soon the full power of the semantic web will be applied to food. Mobile pervasiveness is eliminating the boundary between offline and online, enabling seamless access to information. Open participation by all interested parties—whether consumers, producers or distributors—can democratize the sourcing, verifying and sharing of information.”—Anthony Nicalo on hacking the food system and how the semantic web could eliminate information asymmetry in the “know your farmer” problem
Core Ontology Pattern and Visualization Index
Semantic Web portal dedicated to ontology design patterns (ODPs).
NeOn Toolkit: ontology engineering environment
The Stanford parser: a statistical parser (version 1.6)
The FDG parser: a statistical parser (version 3.7)
The Charniak parser: a statistical parser (version 1.0)
Penn Treebank Project
Visualizing Domain Ontology using Enhanced Anaphora Resolution Algorithm
1. Ontology-based semantic matchmaking approach
Gao Shu, Omer F. Rana, Nick J. Avis, Chen Dingfang (2007):
Elsevier, Advances in Engineering Software, vol. 38, pp. 59–67.
2. Ontology based multiperspective requirements traceability framework
Namfon Assawamekin, Thanwadee Sunetnanta and Charnyote Pluempitiwiriyawej (2009):
Knowledge and Information Systems journal, Springer-Verlag, London.
3. An ontology-based approach for traceability recovery
Zhang, Y., Witte, R., Rilling, J. et al. (2006):
In the Proceedings of the 3rd international workshop on metamodels, schemas, grammars, and ontologies for reverse engineering (ATEM 2006), Genoa, pp 36–43.
4. Recovering traceability links between code and documentation
Antoniol, G., Canfora, G., Casazza, G. et al. (2002):
IEEE Trans Softw Eng, vol. 28, no. 10, pp. 970–983.
5. Ontologies for knowledge management: an information systems perspective
Jurisica, I., Mylopoulos, J., Yu, E. (2004):
In Knowl Inf Syst, vol. 6, no. 4, pp. 380–401.
6. The role of ontologies for an effective and unambiguous dissemination of clinical guidelines
Pisanelli, DM., Gangemi, A., Steve, G. (2000):
In Knowledge Engineering and Knowledge Management. Methods, Models, and Tools, Dieng R, Corby O (eds). pp. 129–139.
7. Efficiency of ontology mapping approaches
Marc Ehrig and Steffen Staab (2004):
In International Workshop on Semantic Intelligent Middleware for the Web and the Grid at ECAI 04, Valencia, Spain.
8. Evaluating ontological decisions with OntoClean
Guarino, N., Welty, C. (2002):
Commun ACM vol. 45, no. 2, pp. 61–65.
9. A framework for ontology integration
Calvanese, D., De Giacomo, G., Lenzerini, M. (2001):
In Proceedings of the 2001 International Semantic Web Working Symposium (SWWS 2001), CA, USA.
10. Some tools and methodologies for domain ontology building
Aldo Gangemi (2003):
Wiley InterScience, Comp Funct Genom, vol. 4, pp. 104–110.
11. Understanding natural language
Winograd, Terry, (1972):
New York: Academic Press.
12. Towards An Annotated Database For Anaphora Resolution
Delmonte, R., Chiran, L. and Bacalu, C. (2000):
LREC, Athens, pp. 63–67.
13. A ranking approach to pronoun resolution
Denis, P. and Baldridge, J. (2007):
In Proc. of IJCAI 2007.
14. Resolving anaphoric references on deficient syntactic descriptions
Stuckardt, Roland. (1997):
In the Proceedings of the ACL’97/EACL’97 workshop on Operational factors in practical, robust anaphora resolution, 30-37. Madrid, Spain.
15. A Hybrid System for Summarization and Question Answering
Delmonte, R.: Getaruns (2003):
16. Comparing Knowledge Sources for Nominal Anaphora Resolution
Katja Markert, Malvina Nissim (2005):
Computational Linguistics, vol. 31, no. 3.
17. An algorithm for Pronominal Anaphora Resolution
Shalom Lappin, Herbert J. Leass (1994):
Association for Computational Linguistics.
18. Evaluating automated and manual acquisition of anaphora resolution strategies
Aone, Chinatsu and Scott Bennett (1995):
In Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics (ACL’95), pp. 122–129.
19. Coreference for NLP applications
Morton, T. S. (2000):
In Proc. of ACL 2000.
20. TERMINAE: a linguistic-based tool for the building of a domain ontology
Aussenac-Gilles, N., Biebow, B. & Szulman, S. (1999):
In EKAW’99, Proceedings of the 11th European Workshop on Knowledge Acquisition, Modelling and Management, LNCS, Berlin, Springer-Verlag, pp. 49–66.
21. TextOntoEx: Automatic Ontology Construction from Natural English Text
Mohamed Yehia Dahab, Hesham A. Hassan & Ahmed Rafea (2006)
AIML 06 International Conference, Sharm El Sheikh, Egypt.
22. (ONTO) Agent: An ontology-based WWW broker to select ontologies
Arpirez JC, Gomez-Perez A, Lozano A & Pinto HS (1998)
ECAI’98 Workshop on Applications of Ontologies and Problem-Solving Methods : Brighton, (UK), (pp 16-24).
23. Stanford typed dependencies manual
Marie-Catherine de Marneffe and Christopher D. Manning (2010):
Stanford Parser Library.
Semantic Web and Enterprise Architecture
In an article entitled “The Semantic Web Goes Mainstream” (29 October 2007), MIT Technology Review reports that a new free web-based tool called Twine (by Radar Networks) will change the way people organize information.
Semantic Web—“a concept, long discussed in research circles, that can be described as a sort of smart network of information in which data is tagged, sorted, and searchable.”
Clay Shirky, professor in the Interactive Telecommunications Program at New York University, says: “At its most basic, the Semantic Web is a campaign to tag information with extra metadata that makes it easier to search.” At the upper limit, he says, it is about waiting for machines to become devastatingly intelligent.
Twine—“Twine is a website where people can dump information that’s important to them, from strings of e-mails to YouTube videos. Or, if a user prefers, Twine can automatically collect all the web pages she visited, e-mails she sent and received, and so on. Once Twine has some information, it starts to analyze it and automatically sort it into categories that include the people involved, concepts discussed, and places, organizations, and companies. This way, when a user is searching for something, she can have quick access to related information about it. Twine also uses elements of social networking so that a user has access to information collected by others in her network. All this creates a sort of ‘collective intelligence,’ says Nova Spivack, CEO and founder of Radar Networks.”
“Twine is also using extremely advanced machine learning and natural-language processing algorithms that give it capabilities beyond anything that relies on manual tagging. The tool uses a combination of natural-language algorithms to automatically extract key concepts from collections of text, essentially automatically tagging them.”
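Even a crude sketch conveys the flavor of automatic tagging. Twine’s actual algorithms are far richer; this toy example (entirely my own illustration) just scores terms by raw frequency after dropping common stop words.

```python
import re
from collections import Counter

# A deliberately tiny stop-word list; real systems use much larger ones.
STOP = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that", "for"}

def key_concepts(text, n=3):
    """Score lowercase word tokens by raw frequency, ignoring the
    stop-word list, and return the n highest-scoring terms."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP)
    return [w for w, _ in counts.most_common(n)]

doc = ("The semantic web tags data with metadata so machines can link "
       "data across the web and search the web by meaning.")
print(key_concepts(doc))  # 'web' and 'data' rank highest
```

Real concept extraction adds phrase detection, named-entity recognition, and disambiguation on top, but the goal is the same: tags without manual tagging.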
A recent article in the Economist described the Semantic Web as follows:
“The semantic web is so called because it aspires to make the web readable by machines as well as humans, by adding special tags, technically known as metadata, to its pages. Whereas the web today provides links between documents which humans read and extract meaning from, the semantic web aims to provide computers with the means to extract useful information from data accessible on the internet, be it on web pages, in calendars or inside spreadsheets.”
So whereas a tool like Google sifts through web pages based on search criteria and serves them up for humans to recognize what they are looking for, the Semantic Web actually connects related information and adds metadata that a computer can understand. It’s like relational databases on steroids, with the intelligence built in to make meaning from the related information!
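The “computer-understandable” part boils down to representing facts as subject-predicate-object triples that can be joined across sources. Here is a toy sketch (all names and facts are invented) of the kind of chained query that keyword search over documents cannot answer:

```python
# All names and facts below are invented for illustration.
calendar_facts = [
    ("ProjectReview", "scheduledOn", "2007-11-05"),
    ("ProjectReview", "organizedBy", "alice"),
]
directory_facts = [
    ("alice", "worksIn", "Finance"),
]

# Because both sources share the triple shape, a machine can merge
# and join them with no understanding of prose.
facts = calendar_facts + directory_facts

def query(subject, predicate, facts):
    """Return all objects matching a (subject, predicate, ?) pattern."""
    return [o for s, p, o in facts if s == subject and p == predicate]

# Chain two facts from different sources: who organizes the review,
# and which department do they work in?
organizer = query("ProjectReview", "organizedBy", facts)[0]
print(query(organizer, "worksIn", facts))  # ['Finance']
```

The join across the calendar and the directory is the relational-databases-on-steroids moment: no human had to read either source to connect them.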
Like a human brain, the Semantic Web connects people, places, and events seamlessly into a unified and actionable ganglion of intelligence.
For User-centric EA, the Semantic Web could be a critical evolution in how enterprise architects analyze architecture information and come up with findings and recommendations for senior management. Using the Semantic Web, business and technology information (such as performance results, business function and activities, information requirements, applications systems, technologies, security, and human capital) would all be related, made machine readable, and automatically provide intelligence to decision-makers in terms of gaps, redundancies, inefficiencies, and opportunities—pinpointed without human intervention. Now that’s business intelligence for the CIO and other leaders, when and where they need it.