Where are the docs, TopDocs? #91

@Alvant

Description

What is the matter

It seems that TopDocumentsViewer assigns each document to only one topic, even if other topics are also well represented in that document. (Those other topics will not list the document in the resulting view.)
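
The suspected behavior can be illustrated with a minimal NumPy sketch. This is not TopicNet's actual implementation: the function `assign_to_single_topic`, the toy Theta values, and the assumed layout (topics as rows, documents as columns) are all assumptions made for illustration.

```python
import numpy as np


def assign_to_single_topic(theta):
    """Hypothetical hard assignment: each document goes only to its argmax topic.

    theta is assumed to have shape (num_topics, num_documents),
    where column d holds p(topic | document d).
    """
    num_topics, num_documents = theta.shape
    best_topic = theta.argmax(axis=0)  # one topic index per document
    return {
        t: [d for d in range(num_documents) if best_topic[d] == t]
        for t in range(num_topics)
    }


# Toy Theta: 3 topics, 2 documents (values invented for illustration).
theta = np.array([
    [0.50, 0.10],
    [0.45, 0.20],
    [0.05, 0.70],
])

docs_by_topic = assign_to_single_topic(theta)
print(docs_by_topic)  # {0: [0], 1: [], 2: [1]}
```

Topic 1 has probability 0.45 in document 0, yet it ends up with no documents because topic 0 narrowly wins the argmax. With such a rule, a dataset of three documents can populate at most three topics, however many topics the model has.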

How to reproduce

  1. Make a small dataset of, say, three documents.
  2. Create a topic model with, say, 50 topics, and fit it on the dataset.
  3. Create a TopDocumentsViewer and use it to view the top documents of the model's topics.

Result

Some topics do not have any documents, even though their probabilities in the Theta matrix are high for some documents.

[Screenshot: Screenshot_2023-04-24_20-13-02]

where the view_model function is:

[Image: definition of the view_model function]

Expected result

There should be a way to control which documents are considered "top documents". For example, if several topics have high probabilities in a particular document, it should be possible to include this document in the top-document lists of all those topics.
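
One possible shape for such a control is a probability threshold, sketched below in plain NumPy. This is only an illustration of the request, not an existing TopicNet API: the function `top_documents_by_threshold` and its parameters `min_probability` and `max_documents` are hypothetical, and Theta is assumed to have topics as rows and documents as columns.

```python
import numpy as np


def top_documents_by_threshold(theta, min_probability=0.3, max_documents=None):
    """Hypothetical rule: list a document under every topic whose
    probability in that document is at least min_probability.

    theta is assumed to have shape (num_topics, num_documents).
    Documents are returned per topic, ordered by decreasing probability.
    """
    result = {}
    for t, row in enumerate(theta):
        candidates = np.flatnonzero(row >= min_probability)
        # Sort the surviving documents by descending probability.
        ordered = candidates[np.argsort(-row[candidates], kind="stable")]
        if max_documents is not None:
            ordered = ordered[:max_documents]
        result[t] = [int(d) for d in ordered]
    return result


# Toy Theta: 3 topics, 2 documents (values invented for illustration).
theta = np.array([
    [0.50, 0.10],
    [0.45, 0.20],
    [0.05, 0.70],
])

shared = top_documents_by_threshold(theta, min_probability=0.3)
print(shared)  # {0: [0], 1: [0], 2: [1]}
```

Here document 0 appears under both topic 0 and topic 1, since both probabilities exceed the threshold. A per-topic top-k rule would be another option; the threshold is chosen here only because it maps directly onto "several topics have high probabilities in a particular document".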

Metadata

Assignees

No one assigned

Labels

enhancement (New feature or request), wontfix (This will not be worked on)
