- ROUGE - Automated text summarization tool 2010.11.05
- [Terminology] Maximum likelihood estimation & Corpus smoothing 2010.10.28
- Wikimedia Downloads 2009.06.29 (1)
- Linear Algebra 2009.05.12 (2)
- Stop words (English) - from Wikipedia 2008.01.16
- [e-book] Introduction to Information retrieval 2008.01.10
- [Paper] Papers on Information Retrieval 2008.01.03
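For simplicity, the maximum likelihood of the probability was measured here using only two factors, but calculations used in practice are more complex because many more situations have to be taken into account. (See: http://en.wikipedia.org/wiki/Maximum_likelihood_method )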
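The original calculation is not reproduced in this excerpt, so the following is only a minimal sketch of the general idea, assuming a unigram language model: the maximum likelihood estimate of a word's probability is its count divided by the total number of tokens, and corpus smoothing is shown here as a simple linear interpolation with the corpus-wide estimate. The function names, the toy data, and the weight `lam` are hypothetical choices for illustration, not taken from the post.

```python
from collections import Counter

def mle_prob(word, tokens):
    """Maximum likelihood estimate: relative frequency of `word` in `tokens`."""
    if not tokens:
        return 0.0
    return Counter(tokens)[word] / len(tokens)

def smoothed_prob(word, doc_tokens, corpus_tokens, lam=0.5):
    """Interpolate the document estimate with the corpus estimate.

    `lam` is an assumed mixing weight: 1.0 means pure document MLE,
    0.0 means pure corpus MLE.
    """
    p_doc = mle_prob(word, doc_tokens)
    p_corpus = mle_prob(word, corpus_tokens)
    return lam * p_doc + (1.0 - lam) * p_corpus

# Toy data (hypothetical): one short document inside a slightly larger corpus.
doc = "information retrieval information".split()
corpus = doc + "stop words retrieval".split()

print(mle_prob("information", doc))      # 2 of 3 tokens in the document
print(mle_prob("stop", doc))             # unseen in the document, so 0.0
print(smoothed_prob("stop", doc, corpus))  # non-zero thanks to the corpus
```

Without smoothing, any word missing from the document gets probability zero; mixing in the corpus estimate is one common way to avoid that.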
Stop words, or stopwords, is the name given to words which are filtered out prior to, or after, processing of natural language data (text).
Hans Peter Luhn, one of the pioneers in information retrieval, is credited with coining the phrase and using the concept in his design. A stop list is controlled by human input and not automated. Filtering words this way is sometimes seen as a negative approach to the natural articles of speech.
There is no definite list of stop words which all natural language processing tools incorporate. Not all NLP tools use a stoplist. Some tools specifically avoid using them to support phrase searching. The use of a stemming algorithm may reduce part of the rationale or dependence on a stoplist to filter out words.
Stop words can cause problems when using a search engine to search for phrases that include them, particularly in names such as 'The Who' or 'Take That'.
Simply put, stop words are words that carry little weight and are therefore excluded from search.
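As a rough illustration of the filtering described above, here is a minimal sketch in Python. The stoplist is a deliberately tiny, hypothetical one (as noted, there is no definitive list), and `tokenize` and `remove_stop_words` are names made up for this example.

```python
import re

# Hypothetical, deliberately tiny stoplist; real tools each use their own.
STOP_WORDS = frozenset(
    {"a", "an", "and", "in", "is", "of", "that", "the", "to", "who"}
)

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def remove_stop_words(text, stop_words=STOP_WORDS):
    """Return the tokens of `text` that are not in the stoplist."""
    return [tok for tok in tokenize(text) if tok not in stop_words]

print(remove_stop_words("The history of information retrieval"))
# ['history', 'information', 'retrieval']
print(remove_stop_words("The Who"))    # []
print(remove_stop_words("Take That"))  # ['take']
```

The last two calls show the phrase-search problem: with this stoplist the query 'The Who' is filtered down to nothing, and 'Take That' loses half of the name.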
I think this is the best book for studying search.
To print it, get the print version;
to read it online, get the onlinereading version.
<Korea - Seoul University>