Commit 7bbb19f

Remove partitioner/quantizer discussion from readme and link to docs (#10033)
To keep information consistent and in sync, link to the CoreML docs rather than discussing the partitioner/quantizer.
1 parent f8d65e9 commit 7bbb19f

File tree

1 file changed (+1, -106 lines)


backends/apple/coreml/README.md

Lines changed: 1 addition & 106 deletions
@@ -1,8 +1,7 @@
 # ExecuTorch Core ML Delegate
 
-
 This subtree contains the Core ML Delegate implementation for ExecuTorch.
-Core ML is an optimized framework for running machine learning models on Apple devices. The delegate is the mechanism for leveraging the Core ML framework to accelerate operators when running on Apple devices.
+Core ML is an optimized framework for running machine learning models on Apple devices. The delegate is the mechanism for leveraging the Core ML framework to accelerate operators when running on Apple devices. To learn how to use the CoreML delegate, see the [documentation](https://github.com/pytorch/executorch/blob/main/docs/source/backends-coreml.md).
 
 ## Layout
 - `compiler/` : Lowers a module to Core ML backend.
@@ -19,110 +18,6 @@
 - `workspace` : Xcode workspace for the runtime.
 - `third-party/`: External dependencies.
 
-## Partition and Delegation
-
-To delegate a Program to the **Core ML** backend, the client must call `to_backend` with the **CoreMLPartitioner**.
-
-```python
-import torch
-import executorch.exir
-
-from executorch.backends.apple.coreml.compiler import CoreMLBackend
-from executorch.backends.apple.coreml.partition import CoreMLPartitioner
-
-class Model(torch.nn.Module):
-    def __init__(self):
-        super().__init__()
-
-    def forward(self, x):
-        return torch.sin(x)
-
-source_model = Model()
-example_inputs = (torch.ones(1), )
-
-# Export the source model to Edge IR representation
-aten_program = torch.export.export(source_model, example_inputs)
-edge_program_manager = executorch.exir.to_edge(aten_program)
-
-# Delegate to Core ML backend
-delegated_program_manager = edge_program_manager.to_backend(CoreMLPartitioner())
-
-# Serialize delegated program
-executorch_program = delegated_program_manager.to_executorch()
-with open("model.pte", "wb") as f:
-    f.write(executorch_program.buffer)
-```
-
-The module will be fully or partially delegated to **Core ML**, depending on whether all or only some of its ops are supported by the **Core ML** backend. The user may force certain ops to be skipped via `CoreMLPartitioner(skip_ops_for_coreml_delegation=...)`, as sketched below.
-
-The `to_backend` implementation is a thin wrapper over [coremltools](https://apple.github.io/coremltools/docs-guides/); `coremltools` is responsible for converting an **ExportedProgram** to an **MLModel**. The converted **MLModel** data is saved, flattened, and returned as bytes to **ExecuTorch**.
-
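For illustration, a minimal sketch of forcing the partitioner to skip an op, reusing `edge_program_manager` from the removed example above. The op-identifier string `"aten.sin.default"` is an assumption for illustration only; check the linked CoreML docs for the exact format the partitioner expects.

```python
from executorch.backends.apple.coreml.partition import CoreMLPartitioner

# Continuing the partition example above: keep `sin` on CPU instead of
# delegating it to Core ML. "aten.sin.default" is a hypothetical op
# identifier used only for illustration.
partitioner = CoreMLPartitioner(skip_ops_for_coreml_delegation=["aten.sin.default"])
delegated_program_manager = edge_program_manager.to_backend(partitioner)
```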
-## Quantization
-
-To quantize a Program in a Core ML-favored way, the client may utilize the **CoreMLQuantizer**.
-
-```python
-import torch
-import executorch.exir
-
-from torch.export import export_for_training
-from torch.ao.quantization.quantize_pt2e import (
-    convert_pt2e,
-    prepare_pt2e,
-    prepare_qat_pt2e,
-)
-
-from executorch.backends.apple.coreml.quantizer import CoreMLQuantizer
-from coremltools.optimize.torch.quantization.quantization_config import (
-    LinearQuantizerConfig,
-    QuantizationScheme,
-)
-
-class Model(torch.nn.Module):
-    def __init__(self) -> None:
-        super().__init__()
-        self.conv = torch.nn.Conv2d(
-            in_channels=3, out_channels=16, kernel_size=3, padding=1
-        )
-        self.relu = torch.nn.ReLU()
-
-    def forward(self, x: torch.Tensor) -> torch.Tensor:
-        a = self.conv(x)
-        return self.relu(a)
-
-source_model = Model()
-example_inputs = (torch.randn((1, 3, 256, 256)), )
-
-pre_autograd_aten_dialect = export_for_training(source_model, example_inputs).module()
-
-quantization_config = LinearQuantizerConfig.from_dict(
-    {
-        "global_config": {
-            "quantization_scheme": QuantizationScheme.symmetric,
-            "activation_dtype": torch.quint8,
-            "weight_dtype": torch.qint8,
-            "weight_per_channel": True,
-        }
-    }
-)
-quantizer = CoreMLQuantizer(quantization_config)
-
-# For post-training quantization, use `prepare_pt2e`
-# For quantization-aware training, use `prepare_qat_pt2e`
-prepared_graph = prepare_pt2e(pre_autograd_aten_dialect, quantizer)
-
-prepared_graph(*example_inputs)
-converted_graph = convert_pt2e(prepared_graph)
-```
-
-The `converted_graph` is the quantized torch model and can be delegated to **Core ML** through the **CoreMLPartitioner** in the same way, as sketched below.
-
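As a rough sketch of that last step, continuing from `converted_graph` and `example_inputs` above and mirroring the partition example earlier on this page; the re-export step is an assumption and may vary across ExecuTorch versions.

```python
import torch
import executorch.exir
from executorch.backends.apple.coreml.partition import CoreMLPartitioner

# Re-export the quantized graph module and delegate it to Core ML,
# mirroring the partition example earlier on this page.
aten_program = torch.export.export(converted_graph, example_inputs)
edge_program_manager = executorch.exir.to_edge(aten_program)
delegated_program_manager = edge_program_manager.to_backend(CoreMLPartitioner())

# Serialize to a .pte file, as in the partition example.
executorch_program = delegated_program_manager.to_executorch()
with open("model_quantized.pte", "wb") as f:
    f.write(executorch_program.buffer)
```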
-## Runtime
-
-To execute a Core ML-delegated program, the application must link against the `coremldelegate` library. Once linked, no additional steps are required: when running the program, ExecuTorch calls the Core ML runtime to execute the Core ML-delegated parts of the program.
-
-Please follow the instructions described in the [Core ML setup](/backends/apple/coreml/setup.md) to link the `coremldelegate` library.
-
 ## Help & Improvements
 If you have problems or questions or have suggestions for ways to make
 implementation and testing better, please create an issue on [github](https://www.github.com/pytorch/executorch/issues).
