python - What is the approach to generate Topics from text using a wikipedia dump -
i'm new nlp/text processing
and building application requires generating topics (music, games, romance, history etc etc.) 2 lines of imput text.
i've decided use wikipedia's articlebase me out in process,
what steps "train" program recognize , categorize these topics input text?
such broad question. automated topic modeling (where don't have train model) might want @ latent dirichlet allocation. in python, gensim nice way lda. i've used weka in java classification tasks, might more you're looking at. , lightside researcher's work bench offers gui text mining tasks.