Text Parsing with Word Association.


Look at the title of this text. It is made up of five words. How often do we see these five words together in a body of text? How often is the word ‘text’ found in a sentence that also contains the word ‘association’?

The aim of this study is to draw some measurable conclusions about how often certain words are associated with each other in a text. One may ask, “Who cares? Why is word association important?” My aim is to provide some answers and to point to exciting research on the relationships between words in a text.


The brain: an incredible parsing machine


The association between words is something that most people simply take for granted. For instance, as you read the preceding sentence, you probably did not have to think very deeply about how the words that make up that sentence are related to each other. You simply read the sentence. One of the amazing things about the human brain is its ability to string the words of a sentence together to form concepts. Exactly how the brain handles this very complex task is still not completely known, but the amount of processing required to string together the words in a simple sentence is incredible when one closely examines what the brain must accomplish.


First, the brain has to store the words of the sentence. It has to retrieve the meaning of each word. Then it has to check if each word makes sense when put together with the other words, and finally it has to make sense of the collective meaning of the group of words. This may not be the exact process, or sequence, but it illustrates just a few of the processes that must take place before we understand a sentence. It is amazing how fast the brain accomplishes a task as seemingly complex as parsing a sentence for meaning.


If one thinks about the brain’s ability to associate one concept with another, it is apparent that this association is one of the most valuable resources on the earth. It is this concept association that allows us to learn and succeed in life. Evolutionists would say that it is this ability that has allowed us to survive and thrive as long as we have, and the more developed this ability is in an entity, the higher on the evolutionary chain that entity becomes. Theists would say that it is this ability to create relationships between concepts that helps develop faith and knowledge. An all-knowing God would certainly possess a greater knowledge of the relationships between things in the universe than ordinary men.


If the reasons cited above aren’t convincing enough or are too ‘theoretical’, just imagine the power that an advertising or marketing company would have if it knew how its target audiences were associating one concept with another. For example, it could design campaigns that exploit those associations to lead customers to associate its ‘product x’ with the concept of ‘good value’.


There has been an appreciable amount of research in the area of computerized text parsing for meaning and artificial intelligence. Most of the research has high theoretical value, but the practical applications are few and far between. The goal of this project is to create a simple computerized application that parses text and measures the frequency with which one word is associated with another. The project combines a browser interface with a database to store texts and statistical data about word association gleaned from those texts. A user enters two words into the system. The system checks for those words in the text and calculates statistics about the association between those words.
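
To make the idea of checking two words against a text more concrete, here is a minimal Python sketch. It is not the project's actual code, and it rests on several assumptions: ‘association’ is taken to mean that both words appear in the same sentence, sentences are split naively on end punctuation, and the overlap ratio (a Jaccard-style measure) is only one of many statistics that could be reported. The function names split_sentences, tokenize, and association_stats are hypothetical, and the database and browser interface are left out entirely.

```python
import re


def split_sentences(text):
    """Naive splitter (an assumption): break after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Return the set of lowercase words in a sentence, ignoring punctuation."""
    return set(re.findall(r"[a-z]+(?:'[a-z]+)?", sentence.lower()))


def association_stats(text, word_a, word_b):
    """Count how often two words share a sentence in the given text.

    Hypothetical helper: 'association' here means same-sentence co-occurrence.
    """
    a, b = word_a.lower(), word_b.lower()
    sentences = [tokenize(s) for s in split_sentences(text)]

    with_a = sum(1 for words in sentences if a in words)
    with_b = sum(1 for words in sentences if b in words)
    with_both = sum(1 for words in sentences if a in words and b in words)
    with_either = with_a + with_b - with_both

    return {
        "sentences": len(sentences),
        "with_a": with_a,
        "with_b": with_b,
        "with_both": with_both,
        # Share of sentences containing either word that contain both.
        "overlap": with_both / with_either if with_either else 0.0,
    }
```

Calling association_stats(some_text, "text", "association") would answer the question posed at the start of this paper: of the sentences that mention either word, what share mention both? A real system would likely need a more careful sentence splitter and some handling of word forms such as plurals, which this sketch ignores.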


Although the application is simple, it holds potential for greater future use and expansion. Imagine being able to record the words that are most often searched along with the path through the search results that the user chooses most often. Over time and with widespread use, such a system would eventually start to create connections between words and concepts, and in turn provide a highly intelligent searching and association engine that more closely mimics the brain’s ability to create synaptic connections between concepts. The implications for the field of artificial intelligence are astounding. Virtually any business or agency that deals with large amounts of information would benefit from improved smart searching and ‘connection-creation’. Expanded exponentially, such a system could theoretically become smart enough to ‘think’ like we do. Imagine a worldwide network of concepts supported by incredible processing power that drives itself, and the concept of an uber-techno-brain is not as far-fetched as it once seemed. The possibilities are as endless as the human brain’s abilities.