Every tech community has a website that aggregates news about its topic. These are called <X>Planet, where <X> is the name of the tech topic; for example, there is a PythonPlanet, a PlanetRDF, etc. Here are a few …
From Incremental Knowledge Discovery in Online Social Media, by Xuning Tang: In light of the prosperity of online social media, Web users are shifting from data consumers to data producers. To catch the pulse of this rapidly changing world,…
This class definition may not fly in any programming language I know. However, if you liked this, you will love this presentation: What is a Data Scientist?
From the Google Research Blog post "Learning from Big Data: 40 Million Entities in Context": When someone mentions Mercury, are they talking about the planet, the god, the car, the element, Freddie, or one of some 89 other possibilities? This problem is called disambiguation (a word that is itself ambiguous), and…
Coursera has a nice course on Natural Language Processing. I missed it when it started, so I am catching up now by viewing the archives. What makes Natural Language Processing difficult? 1. Ambiguity in the language. This slide shows other difficulties. 2. What…
If you could analyze your email, what would you like to see? This is a question that keeps popping up in my head. Here are a few things I can think of: I want a knowledge base created from my…
I never really thought of MeetUps as a trend indicator. It suddenly dawned on me that they can be a leading indicator of activity in different areas. Here is an example. I got a notification email from meetup.com on a…
Here is a great story on how Shell plans to save hundreds of millions of dollars using Semantic Search. They estimate the savings will come from cutting the time spent training their employees by providing the right information based on the employee’s…
I am reading the UIMA overview document. It is a fascinating description of an architecture for analyzing unstructured documents. In analyzing unstructured content, UIMA-based applications make use of a variety of analysis technologies including: • Statistical and rule-based Natural Language Processing…
In his book, Early Warning: Using Competitive Intelligence to Anticipate Market Shifts, Control Risk, and Create Powerful Strategies, Benjamin Gilad talks about causes of failure. Sticking to obsolete internal convictions even though the market evidence points otherwise seems to be…