repo-autoindex

mirror of https://github.com/release-engineering/repo-autoindex.git synced 2025-02-23 13:42:52 +00:00

Author	SHA1	Message	Date
Rohan McGovern	9737288e6b	Fix tests for aiohttp 3.10 aiohttp 3.10 changed the server's handling of content-type/encoding headers for compressed files. This affects our usage of aiohttp.test_utils.TestServer: gzip-compressed files are no longer being decompressed by default. This in itself is not a problem, but the MIME type used ("application/gzip") did not match our client's list of automatically handled MIME types. Add it to the list to handle it in the same way as others. This fixes test_cmd::yum with aiohttp>=3.10.	2024-08-05 10:27:28 +10:00
Rohan McGovern	97a28fb7b1	Ensure directories appear first in listings [RHELDST-21890] Directories are generally expected to be listed first in directory indexes. That was already working for yum and file repos, but wasn't the case for kickstart repos due to their combination of different types of content. This commit applies a consistent sorting so that directories will always come first, and entries will otherwise be sorted by name, for all repo types.	2024-01-12 08:48:42 +10:00
Rohan McGovern	137388d475	Avoid spurious mypy failure in latest xml type hints startElement signature changed in the .pyi stubs for XML classes, triggering a mypy complaint here. Suppress it as there is no actual error here.	2024-01-12 08:48:42 +10:00
Rohan McGovern	eac74ec1e4	Further reduce memory usage on large yum repos [RHELDST-20453] The Fetcher type was designed to return a 'str'. That wasn't a good idea because it implies that every fetched file must be loaded into memory completely. On certain large yum repos, decompressed primary XML can be hundreds of MB, and it's not appropriate to require loading that all into memory at once. Make it support a file-like object (stream of bytes). Since the SAX XML parser supports reading from a stream, this makes it possible to avoid loading everything into memory at once. A test of repo-autoindex CLI against /content/dist/rhel/server/7/7Server/x86_64/os showed major improvement: - before: ~1200MiB - after: ~80MiB Note that achieving the full improvement requires any downstream users of the library (e.g. exodus-gw) to update their Fetcher implementation as well, to stop returning a 'str'.	2023-09-21 11:05:21 +10:00
Rohan McGovern	efb595d624	Add py.typed for PEP 561 This library includes inline type hints, but per PEP 561 this must be indicated by including a "py.typed" marker file, otherwise tools like mypy will not make use of the type hints when checking downstream projects.	2023-09-18 13:28:02 +10:00
Rohan McGovern	6ffbe4736c	Update and fix mypy with latest type hints again `3f478e76f7` added a "type: ignore" here due to a change in typeshed. The commit message mentioned that the type hint may have been wrong. It looks like that was fixed in https://github.com/python/typeshed/pull/9919/files, so it's necessary to also remove the "type: ignore" now.	2023-05-15 08:45:01 +10:00
Rohan McGovern	3f478e76f7	Fix type check with latest mypy/typeshed The following commit defined a return type hint for getElementsByTagName: `3fc2f27990 (diff-f451f731d037ef9d79347194490b32ba613798ea7eaa2c160351a69625f05e08R150)` It defined the return type as a list of Node, while this code expects a list of Element (Element is a subtype of Node). Given that one would expect a getElements method to return specifically elements and not other types of node, I think the typeshed change may be incorrect, but it's hard to be sure since the stdlib docs themselves are ambiguous. Suppress it for now to unblock dependency updates.	2023-04-19 13:57:11 +10:00
Caleigh Runge-Hottman	6a58f13963	Add support for AppStream kickstart repos [RHELDST-14528] AppStream kickstart repos were missing from the initial collection of repos used to test the kickstart repo index functionality. AppStream repos uniquely do not contain "checksums" sections in their treeinfo files. So, when attempting to run repo-autoindex against an AppStream kickstart repo, "KeyError: 'checksums'" was raised. Now, when encountering an AppStream kickstart repo, repo-autoindex does not attempt to parse the "checksums" section.	2023-04-06 10:17:21 -04:00
Caleigh Runge-Hottman	b284ddbd9f	Generate kickstart repo index [RHELDST-14528] Due to the presence of a "repodata/repomd.xml" path in a kickstart repo, repo-autoindex previously interpreted kickstart repos as yum repos. As such, a kickstart repo's index would solely consist of two directories: "Packages" and "repodata". While a kickstart repo does contain a yum repo, kickstart repos also contain two additional repo entry points: treeinfo and extra_files.json. Each entry point references additional files that should be included in a kickstart repo's index. These files were previously ignored. Now, when repo-autoindex encounters a kickstart repo, repo-autoindex produces a repo index that reflects the content referenced in all three repo entry points (repomd.xml, treeinfo, extra_files.json).	2023-03-31 17:53:57 -04:00
Rohan McGovern	117cabb0b7	Use SAX instead of pulldom for primary.xml parsing [RHELDST-14338] Redo the parsing of packages from primary.xml to use SAX; previously it was using pulldom. The motivation for the change is to reduce memory usage. When parsing a larger yum repo such as that contained within rhel-8-for-ppc64le-appstream-kickstart__8_DOT_4, the observed memory usage from repo-autoindex command was: - pulldom: ~700MB - SAX: ~85MB This does not affect the output of the indexing process, and is covered by existing tests.	2022-10-20 09:51:30 +10:00
Rohan McGovern	af99a34e39	Fix 'error: Unused "type: ignore" comment' with mypy 0.981	2022-10-05 10:38:32 +10:00
Rohan McGovern	293f5887b7	Implement error handling Ultimately, all errors are propagated in some way, but it's important to differentiate between "the content was invalid" vs "failed to fetch the content".	2022-08-09 08:51:06 +10:00
Rohan McGovern	90b746caee	Add PULP_MANIFEST support	2022-08-08 13:45:50 +10:00
Rohan McGovern	254e0f7cd9	Add some documentation	2022-08-08 12:46:07 +10:00
Rohan McGovern	52ec5f195b	Add a test running entire command	2022-08-02 16:45:24 +10:00
Rohan McGovern	c4fd70e240	Add a test for text elision	2022-08-02 16:41:46 +10:00
Rohan McGovern	fbd3cb37ec	Add a test which renders a yum repo	2022-07-07 14:04:18 +10:00
Rohan McGovern	787ba01a0e	Rearrange sources to keep API separate, add a real test	2022-07-07 13:20:43 +10:00
Rohan McGovern	3a0285d433	Add github workflows for testing	2022-06-29 16:25:51 +10:00
Rohan McGovern	119f0ea9b6	Make it all pass mypy	2022-06-29 16:19:05 +10:00
Rohan McGovern	05c6bd14be	Add some basic tests & CI setup	2022-06-29 15:52:00 +10:00
Rohan McGovern	8d5a624f18	Tweak default time/size for nonexistent data	2022-06-29 15:44:46 +10:00
Rohan McGovern	3062bdad26	Declare HTML lang and charset	2022-06-20 09:47:09 +10:00
Rohan McGovern	975e9f9f9e	Declare HTML doctype This is important due to certain behavior of localstack S3.	2022-06-20 09:33:52 +10:00
Rohan McGovern	5c4e5354b2	Initial implementation Basically working for yum repos.	2022-06-17 10:31:31 +10:00
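
The ordering described in 97a28fb7b1 (directories first, then by name) can be sketched as a single sort key. This is a minimal illustration only: the Entry type and the sort_entries function are hypothetical names, not the library's own API, and the actual implementation may differ.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    # Hypothetical stand-in for an index entry; not the library's own type.
    name: str
    is_dir: bool


def sort_entries(entries: list[Entry]) -> list[Entry]:
    # False sorts before True, so "not is_dir" places directories first;
    # entries of the same kind are then ordered by name.
    return sorted(entries, key=lambda e: (not e.is_dir, e.name))


# A mix of directories and files, as a kickstart repo might contain.
entries = [
    Entry("treeinfo", False),
    Entry("repodata", True),
    Entry("extra_files.json", False),
    Entry("Packages", True),
]
sorted_names = [e.name for e in sort_entries(entries)]
print(sorted_names)
```

Using one tuple key for every repo type keeps the ordering consistent regardless of which entry point contributed an entry.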

25 commits
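
The memory savings described in 117cabb0b7 and eac74ec1e4 come from feeding a byte stream to a SAX parser instead of building a document in memory. The following is a rough sketch of that technique using only the standard library. LocationHandler, package_paths and the sample XML are assumptions made for illustration; they are not the project's actual handler or Fetcher interface, and the sample is a heavily simplified primary.xml.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class LocationHandler(ContentHandler):
    """Collects the href of each <location> element as it streams past."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def startElement(self, name, attrs):
        # Only a small amount of state is retained per package;
        # the rest of the document is discarded as parsing proceeds.
        if name == "location":
            href = attrs.get("href")
            if href is not None:
                self.hrefs.append(href)


def package_paths(stream) -> list[str]:
    # xml.sax.parse accepts a file-like object, so the caller never
    # needs to hold the whole decompressed document as a str.
    handler = LocationHandler()
    xml.sax.parse(stream, handler)
    return handler.hrefs


# Simplified sample; a real primary.xml carries far more per-package data.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://linux.duke.edu/metadata/common" packages="2">
  <package type="rpm"><name>a</name><location href="Packages/a-1.0-1.noarch.rpm"/></package>
  <package type="rpm"><name>b</name><location href="Packages/b-2.0-1.noarch.rpm"/></package>
</metadata>
"""

paths = package_paths(io.BytesIO(SAMPLE))
print(paths)
```

In practice the stream would be something like an open file or a decompressing wrapper rather than an in-memory BytesIO, which is used here only to keep the example self-contained.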