[WIP] Re-implementing Restore to use Pydantic by ormsbee · Pull Request #554 · openedx/openedx-core

ormsbee · 2026-04-20T04:06:33Z

This code will import things through the existing Studio interface, but it doesn't report errors properly. It's a parallel implementation (most of the original restore code still exists so that I can copy tests and compare implementation logic).

If you want to test on the command line, the new (temporary) command lp_load2 will work. You'd invoke it like this:

python manage.py cms lp_load2 [path] [package_ref] [username]

Starting point for all the logic is in backup_restore.api.load_learning_package.

New modules:

payload.py is where the extraction of TOML files happens, and things get assembled into a big dict for validation. It would also eventually be used to map the other way. It understands the backup archive's internal format.
validation.py is where that's gets turned into a validated Pydantic model and errors will be accumulated (might just fold this into api.py).
schema.py defines the actual Pydantic models. It needs a lot of validation rules (just the bare field stuff exists now).
loading.py is the logic for actually loading validated input into the database. It initializes a Loader object with input data and then calls loader.load_into() to push that data into a target LearningPackage.

I have a lot of tests to write and some details to work out regarding error reporting, but I think I'm comfortable with this as the basic structure. I intend to keep full compatibility with the existing REST endpoint and just do conversions as needed.

This is a re-implementation of the restore part of backup_restore, with the goal of making it more robust and maintainable in the long term.

ormsbee · 2026-04-20T18:03:20Z

@bradenmacdonald, @kdmccormick: The WIP (but functioning) new restore code. There are a lot of missing low-level things and tests, so it's not really ready for review. But please let me know if you have any high level concerns with the overall approach. Thank you.

ormsbee · 2026-04-20T20:15:04Z

+            for static_file_path in fs.glob(f"{comp_ver_dir}/static/**"):
+                if fs.isfile(static_file_path):
+                    rel_path = os.path.relpath(static_file_path, comp_ver_dir)
+                    media[rel_path] = f"fs:{static_file_path}"


@kdmccormick, @bradenmacdonald: I was worried about storing potentially enormous inline assets in a giant JSON doc, so I'm thinking of building the component version media mapping with this convention:

"text:" will be simple text

"base64:" indicates base64 encoded data is to follow

"fs:" indicates that what follows is a path from the filesystem object we're passing around from the input.

So an example:

{ "block.xml": "text:<problem>(bunch of OLX here)</problem>", "static/figure1.webp": "fs:entities/xblock.v1/problem/intro-to-everything/component_versions/v2/static/figure1.webp", "static/logo.svg": "base64:SXMgc29tZW9uZSBjdXJpb3VzIGVub3VnaCB0byBhY3R1YWxseSBkZWNvZGUgdGhpcz8=" }

Sounds nice, but that won't be backwards compatible right?

It's also very similar to data URLs; I like the simplicity here but could be worth considering using standard data URLs since there are already lots of libraries for parsing and generating them. (Although the syntax is not as nice; you have these mandatory , and ; which look weird when omitting MIME types.)

{ "block.xml": "data:,<problem>(bunch of OLX here)</problem>", "static/logo.svg": "data:;base64,SXMgc29tZW9uZSBjdXJpb3VzIGVub3VnaCB0byBhY3R1YWxseSBkZWNvZGUgdGhpcz8=" }

Sounds nice, but that won't be backwards compatible right?

This would be for the currently internal format after things get extracted and pieced together, but before Pydantic model validation. So it wouldn't break backwards compatibility with anything.

It's also very similar to data URLs;

That occurred to me, but I had only ever used those for base64 encoding binary data. I hadn't realized you could use them for plain text in a clear, non-escaped way. Thank you! I'll do that I think. So if it starts with "data:", I'll parse it like a data URL, and if it's just a path, I'll assume it comes from the archive.

ormsbee force-pushed the backup-restore-pydantic branch from 2518655 to cdfb9db Compare April 20, 2026 17:49

feat: swtich LearningPackage restore to pydantic

22d7143

This is a re-implementation of the restore part of backup_restore, with the goal of making it more robust and maintainable in the long term.

ormsbee force-pushed the backup-restore-pydantic branch from cdfb9db to 22d7143 Compare April 20, 2026 17:50

ormsbee commented Apr 20, 2026

View reviewed changes

ormsbee added 2 commits April 20, 2026 17:13

refactor: break off entities extraction for easier testing

c940e02

temp: add a bunch of tests

944ea74

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Re-implementing Restore to use Pydantic#554

[WIP] Re-implementing Restore to use Pydantic#554
ormsbee wants to merge 3 commits intoopenedx:mainfrom
ormsbee:backup-restore-pydantic

ormsbee commented Apr 20, 2026 •

edited

Loading

Uh oh!

ormsbee commented Apr 20, 2026

Uh oh!

ormsbee Apr 20, 2026 •

edited

Loading

Uh oh!

bradenmacdonald Apr 20, 2026 •

edited

Loading

Uh oh!

ormsbee Apr 21, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ormsbee commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ormsbee commented Apr 20, 2026

Uh oh!

ormsbee Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bradenmacdonald Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ormsbee Apr 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ormsbee commented Apr 20, 2026 •

edited

Loading

ormsbee Apr 20, 2026 •

edited

Loading

bradenmacdonald Apr 20, 2026 •

edited

Loading

ormsbee Apr 21, 2026 •

edited

Loading