Skip to content

gh-132983: Reduce the size of _zstdmodule.h #133793

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

AA-Turner
Copy link
Member

@AA-Turner AA-Turner commented May 10, 2025

We should try and reduce the content in the header file, as it is exposed to anything that includes it.

  • mt_continue_should_break() moves into compressor.c
  • The decompress_type enum moves into decompressor.c
  • Replace #pragma once (not used for any other modules or non-vendored code) with guard defines
  • Use angle-bracket includes for z{std,dict}.h
  • Add a zstddict.h header file to contain the definition of the ZstdDict struct
  • Move the ZstdCompressor struct to compressor.c (likewise for decompressor)
  • Make PyModuleDef _zstdmodule static

A

@@ -149,12 +47,6 @@ typedef enum {
DICT_TYPE_PREFIX = 2
} dictionary_type;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this should move to zstddict.h?

Copy link
Member Author

@AA-Turner AA-Turner May 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd support moving it, but DICT_TYPE_DIGESTED is used by all three C files, so there's no easy way if I understand correctly. My main goal here was to remove the #include <zstd.h> from the file, though, which is done.

@hugovk hugovk changed the title GH-132983: Reduce the size of _zstdmodule.h May 10, 2025
@hugovk
Copy link
Member

hugovk commented May 10, 2025

Nit note about PR titles, the convention is lowercase gh-NNN for issues, and uppercase GH-NNN for PRs:

please make sure to reference the issue number using gh-NNNNN: prefix in the pull request title

https://devguide.python.org/getting-started/pull-request-lifecycle/#submitting

[3.9] gh-12345: Fix the Spam Module (GH-NNNN)

Here gh-12345 is the GitHub issue number, and GH-NNNN is the number of the original pull request.

https://devguide.python.org/core-developers/committing/#backporting-changes-to-an-older-version

@@ -7,7 +7,13 @@ Python module.
# define Py_BUILD_CORE_MODULE 1
#endif

#include "Python.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I never remember whether we explicitly include Python.h or not, considering _zstdmodule.h is already including it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me neither, I'll check to see what's most common

/*
Low level interface to Meta's zstd library for use in the compression.zstd
Python module.
*/

/* Declarations shared between different parts of the _zstd module*/

#ifndef ZSTD_MODULE_H
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest having a ZSTD_ZSTDMODULE_H so to mimick the file path. This allows for a possible future Modules/_zstdmodule.h file

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could potentially also rename to _module.{ch} similar to sqlite? I think the guard define isn't public API here so we can change it if we want, but I'm not fully sure.

Copy link
Member

@picnixz picnixz May 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have _io/_iomodule.[ch] and _ctypes/ctypes.h I think it's better to keep the full name because it helps when looking for the file. And yes, AFAIK the guard is not public API, so up to you

@@ -164,4 +56,4 @@ extern void
set_parameter_error(const _zstd_state* const state, int is_compress,
int key_v, int value_v);

static const char init_twice_msg[] = "__init__ method is called twice.";
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#endif
#endif // !ZSTD_MODULE_H
@@ -3,9 +3,13 @@ Low level interface to Meta's zstd library for use in the compression.zstd
Python module.
*/

#include "_zstdmodule.h"
#ifndef ZSTD_BUFFER_H
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto for the naming + endif comments

@@ -7,24 +7,50 @@ Python module.

/*[clinic input]
module _zstd
class _zstd.ZstdCompressor "ZstdCompressor *" "clinic_state()->ZstdCompressor_type"
class _zstd.ZstdCompressor "ZstdCompressor *" "zstd_compressor_type_spec"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I never used a type spec here as I've only seen clinic_state()->TYPE constructions so I don't know if it's correct. However, if you want to keep this, use &zstd_compressor_type_spec not zstd_compressor_type_spec as you did for the other types.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops re &! There are other examples of _spec being used, so I think it's ok.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment