Commit 21448c7

REST code generation for 2024-07 prerelease (#361)
## Problem

We need to generate code off of the openapi specs stored in our apis repo. We'll need to massage the output a bit in order to avoid having a lot of confusing duplication of generated exception classes and other utils.

## Solution

- I created a new `codegen` directory to act as the home for everything codegen related. Inside this directory there are several things:
  - Git submodule of our template files (slightly tweaked from the open source generator, but in a separate repo so they can be shared with plugins)
  - Git submodule of our apis repo
  - A `build-oas.sh` script which does everything to generate the code off the spec, adjust import paths, and place things into the `pinecone` module package structure.

The script does the following:

- Pulls in the latest from the apis repo
- Builds the apis repo to produce versioned spec files
- For each of data plane and control plane, generates a python package.
- Performs surgery to extract the duplicated files across these two packages into a `shared` package.
- Uses `sed` to adjust import paths to reflect the modified code structure.

The surgery to extract shared code is more than just a cosmetic change. Without it, you can end up with a lot of confusion in the areas of configuration and error handling, because there would be multiple objects that, despite being identical in name and functionality, remain distinct from the perspective of language utils such as `isinstance`, `.__class__` and `except`.
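To illustrate why identically named classes in two generated packages cause trouble, here is a minimal sketch. It does not use the real generated code: `make_api_exception`, `ControlApiException`, and `DataApiException` are hypothetical stand-ins for an exception class that would otherwise be generated once per package.

```python
def make_api_exception() -> type:
    """Build a fresh class named ApiException (hypothetical stand-in for one
    copy of a generated exception class)."""
    return type("ApiException", (Exception,), {})


# Two copies, as if one came from the control package and one from the data package.
ControlApiException = make_api_exception()
DataApiException = make_api_exception()

# Same name, same behaviour, but two distinct class objects.
same_class = ControlApiException is DataApiException

# An error raised by one copy is not an instance of the other copy.
error = DataApiException("boom")
matches_control = isinstance(error, ControlApiException)


def caught_by_control_handler() -> bool:
    """Return True if `except ControlApiException` catches a data-plane error."""
    try:
        try:
            raise DataApiException("boom")
        except ControlApiException:
            return True
    except DataApiException:
        # The inner handler did not match, so the error propagated here.
        return False
```

With a single shared `exceptions` module there is only one class object, so a single `except` clause covers errors from both packages.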
When the script is done running, the generated outputs have this structure:

```txt
pinecone/core
└── openapi
    ├── control
    │   ├── api
    │   │   ├── inference_api.py
    │   │   └── manage_indexes_api.py
    │   └── model
    │       ├── __init__.py
    │       ├── create_index_request.py
    │       ├── index_model.py
    │       └── ...etc
    ├── data
    │   ├── api
    │   │   └── data_plane_api.py
    │   └── model
    │       ├── __init__.py
    │       ├── upsert_request.py
    │       ├── vector.py
    │       └── ...etc
    └── shared
        ├── api_client.py
        ├── configuration.py
        ├── exceptions.py
        ├── model_utils.py
        └── rest.py
```

## What actually changed in here?

It seems the main substantive change is in how the `spec` info is expected to be passed when creating an index. So I had to write some additional tests and make some modifications to the `create_index` method.

Also, inference-related stuff is currently generated. But for now, all the wrapper implementations for that functionality are only available by installing a separate plugin, [pinecone-plugin-inference](https://github.com/pinecone-io/python-plugin-inference).

## Type of Change

- [x] Breaking change (fix or feature that would cause existing functionality to not work as expected)

## Test Plan

To run the script, run `make generate-oas`.

Before pushing, I ran tests locally with `make test-unit` and `PINECONE_API_KEY='key' make test-integration`. This helped me catch a lot of small issues related to the sed rewrite of import paths.

Want to see tests passing even though I changed quite a bit about how the generated code is structured.
1 parent ab3ccfe commit 21448c7

106 files changed: +3268 -669 lines changed

.gitmodules

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+[submodule "codegen/apis"]
+	path = codegen/apis
+	url = git@github.com:pinecone-io/apis.git
+[submodule "codegen/python-oas-templates"]
+	path = codegen/python-oas-templates
+	url = git@github.com:pinecone-io/python-oas-templates.git

Makefile

Lines changed: 7 additions & 18 deletions
@@ -13,13 +13,20 @@ test-unit:
 	@echo "Running tests..."
 	poetry run pytest --cov=pinecone --timeout=120 tests/unit
 
+test-integration:
+	@echo "Running integration tests..."
+	PINECONE_ENVIRONMENT="us-east4-gcp" SPEC='{"serverless": {"cloud": "aws", "region": "us-east-1" }}' DIMENSION=2 METRIC='cosine' GITHUB_BUILD_NUMBER='local' poetry run pytest tests/integration
+
 test-grpc-unit:
 	@echo "Running tests..."
 	poetry run pytest --cov=pinecone --timeout=120 tests/unit_grpc
 
 make type-check:
 	poetry run mypy pinecone --exclude pinecone/core
 
+make generate-oas:
+	./codegen/build-oas.sh "2024-07"
+
 version:
 	poetry version
 
@@ -28,21 +35,3 @@ package:
 
 upload:
 	poetry publish --verbose --username ${PYPI_USERNAME} --password ${PYPI_PASSWORD}
-
-upload-spruce:
-	# Configure Poetry for publishing to testpypi
-	poetry config repositories.test-pypi https://test.pypi.org/legacy/
-	poetry publish --verbose -r test-pypi --username ${PYPI_USERNAME} --password ${PYPI_PASSWORD}
-
-license:
-	# Add license header using https://github.com/google/addlicense.
-	# If the license header already exists in a file, re-running this command has no effect.
-	pushd ${mkfile_path}/pinecone && \
-	docker run --rm -it -v ${mkfile_path}/pinecone:/src ghcr.io/google/addlicense:latest -f ./license_header.txt *.py */*.py */*/*.py */*/*/*.py */*/*/*/*.py */*/*/*/*/*.py */*/*/*/*/*/*.py; \
-	popd
-
-set-production:
-	echo "production" > pinecone/__environment__
-
-set-development:
-	echo "" > pinecone/__environment__

codegen/apis

Submodule apis added at fbd9d8d

codegen/build-oas.sh

Lines changed: 148 additions & 0 deletions
@@ -0,0 +1,148 @@
+#!/bin/bash
+
+set -eux -o pipefail
+
+version=$1 # e.g. 2024-07
+modules=("control" "data")
+
+destination="pinecone/core/openapi"
+build_dir="build"
+
+update_apis_repo() {
+    echo "Updating apis repo"
+    pushd codegen/apis
+    git fetch
+    git pull
+    just build
+    popd
+}
+
+update_templates_repo() {
+    echo "Updating templates repo"
+    pushd codegen/python-oas-templates
+    git fetch
+    git pull
+    popd
+}
+
+verify_spec_version() {
+    local version=$1
+    echo "Verifying spec version $version exists in apis repo"
+    if [ -z "$version" ]; then
+        echo "Version is required"
+        exit 1
+    fi
+
+    verify_directory_exists "codegen/apis/_build/${version}"
+}
+
+verify_file_exists() {
+    local filename=$1
+    if [ ! -f "$filename" ]; then
+        echo "File does not exist at $filename"
+        exit 1
+    fi
+}
+
+verify_directory_exists() {
+    local directory=$1
+    if [ ! -d "$directory" ]; then
+        echo "Directory does not exist at $directory"
+        exit 1
+    fi
+}
+
+generate_client() {
+    local module_name=$1
+
+    oas_file="codegen/apis/_build/${version}/${module_name}_${version}.oas.yaml"
+    openapi_generator_config="codegen/openapi-config.${module_name}.json"
+    template_dir="codegen/python-oas-templates/templates5.2.0"
+
+    verify_file_exists $oas_file
+    verify_file_exists $openapi_generator_config
+    verify_directory_exists $template_dir
+
+    # Cleanup previous build files
+    echo "Cleaning up previous build files"
+    rm -rf "${build_dir}"
+
+    # Generate client module
+    docker run --rm -v $(pwd):/workspace openapitools/openapi-generator-cli:v5.2.0 generate \
+        --input-spec "/workspace/$oas_file" \
+        --generator-name python \
+        --config "/workspace/$openapi_generator_config" \
+        --output "/workspace/${build_dir}" \
+        --template-dir "/workspace/$template_dir"
+
+    # Copy the generated module to the correct location
+    rm -rf "${destination}/${module_name}"
+    cp -r "build/pinecone/core/openapi/${module_name}" "${destination}/${module_name}"
+}
+
+extract_shared_classes() {
+    target_directory="${destination}/shared"
+    mkdir -p "$target_directory"
+
+    # Define the list of shared source files
+    sharedFiles=(
+        "api_client"
+        "configuration"
+        "exceptions"
+        "model_utils"
+        "rest"
+    )
+
+    source_directory="${destination}/${modules[0]}"
+
+    # Loop through each file we want to share and copy it to the target directory
+    for file in "${sharedFiles[@]}"; do
+        cp "${source_directory}/${file}.py" "$target_directory"
+    done
+
+    # Cleanup shared files in each module
+    for module in "${modules[@]}"; do
+        source_directory="${destination}/${module}"
+        for file in "${sharedFiles[@]}"; do
+            rm "${source_directory}/${file}.py"
+        done
+    done
+
+    # Remove the docstring headers that aren't really correct in the
+    # context of this new shared package structure
+    find "$target_directory" -name "*.py" -print0 | xargs -0 -I {} sh -c 'sed -i "" "/^\"\"\"/,/^\"\"\"/d" "{}"'
+
+    echo "All shared files have been copied to $target_directory."
+
+    # Adjust import paths in every file
+    find "${destination}" -name "*.py" | while IFS= read -r file; do
+        sed -i '' 's/from \.\.model_utils/from pinecone\.core\.openapi\.shared\.model_utils/g' "$file"
+
+        for module in "${modules[@]}"; do
+            sed -i '' "s/from pinecone\.core\.openapi\.$module import rest/from pinecone\.core\.openapi\.shared import rest/g" "$file"
+
+            for sharedFile in "${sharedFiles[@]}"; do
+                sed -i '' "s/from pinecone\.core\.openapi\.$module\.$sharedFile/from pinecone\.core\.openapi\.shared\.$sharedFile/g" "$file"
+            done
+        done
+    done
+}
+
+update_apis_repo
+update_templates_repo
+verify_spec_version $version
+
+rm -rf "${destination}"
+mkdir -p "${destination}"
+
+for module in "${modules[@]}"; do
+    generate_client $module
+done
+
+# Even though we want to generate multiple packages, we
+# don't want to duplicate every exception and utility class.
+# So we do a bit of surgery to combine the shared files.
+extract_shared_classes
+
+# Format generated files
+poetry run black "${destination}"
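
The import-path rewrite performed by the `sed` calls in `extract_shared_classes` can be sketched in Python as follows. This is an illustrative equivalent under stated assumptions, not part of the commit: `rewrite_imports` and `SHARED_PACKAGE` are hypothetical names, and the function works on a string rather than editing files in place. The module and shared-file lists are copied from the script.

```python
import re

# Copied from build-oas.sh.
MODULES = ("control", "data")
SHARED_FILES = ("api_client", "configuration", "exceptions", "model_utils", "rest")

# Hypothetical constant naming the package that receives the shared files.
SHARED_PACKAGE = "pinecone.core.openapi.shared"


def rewrite_imports(source: str) -> str:
    """Point imports of the duplicated files at the shared package."""
    # Relative import of model_utils becomes an absolute import from shared.
    source = re.sub(r"from \.\.model_utils", f"from {SHARED_PACKAGE}.model_utils", source)

    for module in MODULES:
        prefix = rf"from pinecone\.core\.openapi\.{module}"

        # `from <package> import rest` becomes `from <shared> import rest`.
        source = re.sub(rf"{prefix} import rest", f"from {SHARED_PACKAGE} import rest", source)

        # `from <package>.<shared_file>` becomes `from <shared>.<shared_file>`.
        for shared_file in SHARED_FILES:
            source = re.sub(
                rf"{prefix}\.{shared_file}",
                f"from {SHARED_PACKAGE}.{shared_file}",
                source,
            )
    return source


sample = (
    "from ..model_utils import ModelNormal\n"
    "from pinecone.core.openapi.control import rest\n"
    "from pinecone.core.openapi.data.exceptions import ApiValueError\n"
    "from pinecone.core.openapi.data.model.vector import Vector\n"
)
rewritten = rewrite_imports(sample)
```

In this sample the first three imports are redirected to the shared package, while the import of `Vector` is left alone because `model.vector` is not one of the shared files.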

codegen/openapi-config.control.json

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+{
+  "packageName": "pinecone.core.openapi.control",
+  "pythonAttrNoneIfUnset": true
+}

codegen/openapi-config.data.json

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+{
+  "packageName": "pinecone.core.openapi.data",
+  "pythonAttrNoneIfUnset": true
+}

codegen/python-oas-templates

Submodule python-oas-templates added at b72bd5b

pinecone/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@
 from .data import *
 from .models import *
 
-from .core.client.models import (
+from .core.openapi.control.models import (
     IndexModel,
 )
 

pinecone/config/config.py

Lines changed: 20 additions & 5 deletions
@@ -1,9 +1,11 @@
 from typing import NamedTuple, Optional, Dict
 import os
 
-from pinecone.exceptions import PineconeConfigurationError
+from pinecone.exceptions.exceptions import PineconeConfigurationError
 from pinecone.config.openapi import OpenApiConfigFactory
-from pinecone.core.client.configuration import Configuration as OpenApiConfiguration
+from pinecone.core.openapi.shared.configuration import (
+    Configuration as OpenApiConfiguration,
+)
 from pinecone.utils import normalize_host
 from pinecone.utils.constants import SOURCE_TAG
 
@@ -57,15 +59,28 @@ def build(
         if not host:
             raise PineconeConfigurationError("You haven't specified a host.")
 
-        return Config(api_key, host, proxy_url, proxy_headers, ssl_ca_certs, ssl_verify, additional_headers, source_tag)
+        return Config(
+            api_key,
+            host,
+            proxy_url,
+            proxy_headers,
+            ssl_ca_certs,
+            ssl_verify,
+            additional_headers,
+            source_tag,
+        )
 
     @staticmethod
     def build_openapi_config(
-        config: Config, openapi_config: Optional[OpenApiConfiguration] = None, **kwargs
+        config: Config,
+        openapi_config: Optional[OpenApiConfiguration] = None,
+        **kwargs,
     ) -> OpenApiConfiguration:
         if openapi_config:
             openapi_config = OpenApiConfigFactory.copy(
-                openapi_config=openapi_config, api_key=config.api_key, host=config.host
+                openapi_config=openapi_config,
+                api_key=config.api_key,
+                host=config.host,
             )
         elif openapi_config is None:
             openapi_config = OpenApiConfigFactory.build(api_key=config.api_key, host=config.host)

pinecone/config/openapi.py

Lines changed: 10 additions & 2 deletions
@@ -7,7 +7,9 @@
 
 from urllib3.connection import HTTPConnection
 
-from pinecone.core.client.configuration import Configuration as OpenApiConfiguration
+from pinecone.core.openapi.shared.configuration import (
+    Configuration as OpenApiConfiguration,
+)
 
 TCP_KEEPINTVL = 60 # Sec
 TCP_KEEPIDLE = 300 # Sec
@@ -84,7 +86,13 @@ def _get_socket_options(
             and hasattr(socket, "TCP_KEEPCNT")
         ):
             socket_params += [(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, keep_alive_idle_sec)]
-            socket_params += [(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, keep_alive_interval_sec)]
+            socket_params += [
+                (
+                    socket.IPPROTO_TCP,
+                    socket.TCP_KEEPINTVL,
+                    keep_alive_interval_sec,
+                )
+            ]
             socket_params += [(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, keep_alive_tries)]
 
         # TCP Keep Alive Probes for Windows OS
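
For readers unfamiliar with the keep-alive tuples touched in the hunk above, here is a minimal standalone sketch of how such a list of socket options can be built and applied using only the standard library. It is not the body of `_get_socket_options`: the names `keep_alive_options` and `apply_options` are hypothetical, the idle and interval defaults mirror the `TCP_KEEPIDLE` and `TCP_KEEPINTVL` constants visible in the diff, and the default of 4 probes is an assumption chosen for illustration.

```python
import socket
from typing import List, Tuple

# (level, option name, value), the shape accepted by socket.setsockopt.
SocketOption = Tuple[int, int, int]


def keep_alive_options(
    idle_sec: int = 300,      # mirrors TCP_KEEPIDLE in the diff
    interval_sec: int = 60,   # mirrors TCP_KEEPINTVL in the diff
    tries: int = 4,           # assumed value, not taken from the source
) -> List[SocketOption]:
    """Build keep-alive options, adding TCP tuning only where the OS exposes it."""
    options: List[SocketOption] = [(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)]
    if (
        hasattr(socket, "TCP_KEEPIDLE")
        and hasattr(socket, "TCP_KEEPINTVL")
        and hasattr(socket, "TCP_KEEPCNT")
    ):
        options += [
            (socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_sec),
            (socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_sec),
            (socket.IPPROTO_TCP, socket.TCP_KEEPCNT, tries),
        ]
    return options


def apply_options(sock: socket.socket, options: List[SocketOption]) -> None:
    """Apply each (level, name, value) tuple to an existing socket."""
    for level, name, value in options:
        sock.setsockopt(level, name, value)
```

The `hasattr` guard matters because not every platform defines all three `TCP_KEEP*` constants, which is why the original code checks for them before extending the list.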
