This repository was archived by the owner on Sep 12, 2024. It is now read-only.
Submit execution context to TextCortex API #7

@osolmaz

Description

Currently, each request to the TextCortex API generates code for a single cell in isolation. Without the context of the rest of the notebook (earlier cells, global variables, etc.), the API returns code that does not fit together, forcing the user to be overly specific in their prompts, e.g. about variable names.

Ideally, the entire execution context, i.e.

  1. inputs of previously executed cells,
  2. code generated from prompts,
  3. outputs of previously executed cells,
  4. names of variables in the global namespace,
  5. values of variables in the global namespace

should all be submitted to the API in each request for the best possible generation.

Bandwidth is a bottleneck when code is generated remotely, so the request payload needs to be pruned without losing too much of the context. As a ballpark, it should not exceed 500 kB.

Implementation

Fortunately, IPython caches the inputs and outputs of each cell and stores them in hidden variables in the global namespace, which we can easily access:

https://ipython.readthedocs.io/en/stable/interactive/reference.html#input-caching-system
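As a minimal sketch of reading that cache: inside an IPython session, `In` (alias `_ih`) is a list of input strings indexed by execution count, and `Out` (alias `_oh`) is a dict of outputs keyed by execution count. The helper name `collect_history` and the record keys are hypothetical. The function takes the namespace as a plain dict so it can be tried outside IPython.

```python
def collect_history(namespace):
    """Return one record per executed cell from an IPython-style namespace."""
    inputs = namespace.get("In", [])    # list of source strings
    outputs = namespace.get("Out", {})  # dict: execution count -> result
    records = []
    for count, source in enumerate(inputs):
        if not source:  # In[0] is an empty placeholder
            continue
        records.append({
            "execution_count": count,
            "input": source,
            # Cells without a result (e.g. assignments) have no entry in Out.
            "output": repr(outputs[count]) if count in outputs else None,
        })
    return records


# Stand-in for what IPython keeps in the user namespace.
fake_namespace = {"In": ["", "x = 2", "x * 21"], "Out": {2: 42}}
history = collect_history(fake_namespace)
```

In a live session, the same function could presumably be called with `get_ipython().user_ns` instead of the stand-in dict.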

For submission to a remote API, the history variables need to be pruned down to the aforementioned limit. Code generation quality is expected to degrade as more information is discarded, but we expect it to perform reasonably well already with only (1), (2) and (4) from above.
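One simple way to stay under the limit, sketched here as an assumption rather than the intended algorithm, is to drop the oldest cells until the serialized history fits. The names `prune_oldest`, `payload_size` and `MAX_PAYLOAD_BYTES` are hypothetical.

```python
import json

MAX_PAYLOAD_BYTES = 500_000  # ballpark limit from the description


def payload_size(cells):
    """Size in bytes of the cells once serialized to JSON."""
    return len(json.dumps(cells).encode("utf-8"))


def prune_oldest(cells, limit=MAX_PAYLOAD_BYTES):
    """Drop cells from the start of the history until the JSON fits the limit."""
    kept = list(cells)
    while kept and payload_size(kept) > limit:
        kept.pop(0)  # the oldest cell is assumed to be the least relevant
    return kept


# Small limit used only to make the effect visible in this example.
cells = [{"input": "a" * 400}, {"input": "b" * 400}, {"input": "c = 1"}]
pruned = prune_oldest(cells, limit=500)
```

Re-serializing on every iteration is wasteful for long histories, but it keeps the sketch short. A real implementation might track per-cell sizes instead.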

  • Implement logic to pack

    • (1)
    • (2)
    • (3)
    • (4)
    • (5)
  • Create a schema to convert the packed dict into JSON

That JSON would then be included in the payload and processed by the API for each request.
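The packing step for (1), (2) and (4) could look roughly like the sketch below. The field names (`inputs`, `generated_code`, `variable_names`) and the helper names are assumptions for illustration, not an agreed schema. Names starting with an underscore and a few IPython-provided names are filtered out so that only user-defined variables are reported.

```python
import json

# Names that IPython itself places in the user namespace.
IPYTHON_NAMES = {"In", "Out", "exit", "quit", "get_ipython"}


def variable_names(namespace):
    """Return the sorted names of user-defined globals."""
    return sorted(
        name for name in namespace
        if not name.startswith("_") and name not in IPYTHON_NAMES
    )


def pack_context(inputs, generated, namespace):
    """Bundle items (1), (2) and (4) into a JSON-serializable dict."""
    return {
        "inputs": inputs,
        "generated_code": generated,
        "variable_names": variable_names(namespace),
    }


namespace = {"In": [], "Out": {}, "_": None, "df": object(), "total": 3}
context = pack_context(
    inputs=["total = 3"],
    generated=["df = load_data()"],
    namespace=namespace,
)
body = json.dumps(context)  # string to embed in the request payload
```

Only names are serialized here, so values that are not JSON-serializable (such as the `object()` above) do not cause errors.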

Notes

The JSON schema should follow the Jupyter notebook format, with code-generation-specific data stored in cell metadata.
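A rough sketch of what such a payload might look like, built as a plain dict in the nbformat 4 layout. The metadata key `textcortex` and its `prompt` field are hypothetical, as is the helper `to_notebook`. Outputs are left empty, in line with sending only inputs and generated code at first.

```python
import json


def to_notebook(cells):
    """Build a notebook-shaped dict from (source, prompt) tuples.

    prompt is None for cells the user wrote by hand.
    """
    nb_cells = []
    for count, (source, prompt) in enumerate(cells, start=1):
        metadata = {}
        if prompt is not None:
            # Hypothetical key for code-generation-specific data.
            metadata["textcortex"] = {"prompt": prompt}
        nb_cells.append({
            "cell_type": "code",
            "execution_count": count,
            "metadata": metadata,
            "outputs": [],
            "source": source,
        })
    return {"cells": nb_cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


notebook = to_notebook([
    ("import pandas as pd", None),
    ("df = pd.read_csv('data.csv')", "load data.csv into a dataframe"),
])
notebook_json = json.dumps(notebook)
```

Reusing the notebook layout means the API side could parse the payload with existing notebook tooling, while anything specific to code generation stays inside each cell's `metadata`.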

Future work

  • A more sophisticated pruning algorithm that processes and includes (3) and (5) in the payload

Metadata

Labels

enhancement, high priority