
Deploying PyTorch Models to Production with Cortex



install · documentation · examples · we're hiring · chat with us


Model serving at scale

Deploy

  • Deploy TensorFlow, PyTorch, ONNX, scikit-learn, and other models.
  • Define preprocessing and postprocessing steps in Python.
  • Configure APIs as realtime or batch.
  • Deploy multiple models per API.

Manage

  • Monitor API performance and track predictions.
  • Update APIs with no downtime.
  • Stream logs from APIs.
  • Perform A/B tests.

Scale

  • Test locally, scale on your AWS account.
  • Autoscale to handle production traffic.
  • Reduce cost with spot instances.

How it works

Write APIs in Python

Define any real-time or batch inference pipeline as simple Python APIs, regardless of framework.

# predictor.py

from transformers import pipeline

class PythonPredictor:
  def __init__(self, config):
    self.model = pipeline(task="text-generation")

  def predict(self, payload):
    return self.model(payload["text"])[0]
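The example above needs a model download to run. As a minimal, self-contained sketch of the same predictor contract — Cortex constructs the class once per replica, then calls `predict` for every request — here is a hypothetical stand-in (the stub "model" and the `prefix` config key are illustrative, not part of Cortex):

```python
class PythonPredictor:
    def __init__(self, config):
        # Called once per replica at startup; load models and read config here.
        self.prefix = config.get("prefix", "generated: ")

    def predict(self, payload):
        # Called for every request; payload is the parsed JSON request body.
        return self.prefix + payload["text"]

# Exercising the contract directly, the way Cortex would:
predictor = PythonPredictor({"prefix": "echo: "})
print(predictor.predict({"text": "hello"}))  # -> echo: hello
```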

Configure infrastructure in YAML

Configure autoscaling, monitoring, compute resources, update strategies, and more.

# cortex.yaml

- name: text-generator
  predictor:
    path: predictor.py
  networking:
    api_gateway: public
  compute:
    gpu: 1
  autoscaling:
    min_replicas: 3

Scale to handle production traffic

Handle traffic with request-based autoscaling. Minimize spend with spot instances and multi-model APIs.

$ cortex get text-generator

endpoint: https://example.com/text-generator

status   last-update   replicas   requests   latency
live     10h           10         100000     100ms
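The core idea behind request-based autoscaling can be sketched as: size the fleet so each replica handles roughly a target number of concurrent requests, clamped to the configured bounds. The function and parameter names below are illustrative, not Cortex's actual implementation:

```python
import math

def desired_replicas(in_flight_requests, target_replica_concurrency,
                     min_replicas, max_replicas):
    # Enough replicas so each one serves ~target_replica_concurrency
    # concurrent requests, never leaving the [min, max] window.
    raw = math.ceil(in_flight_requests / target_replica_concurrency)
    return max(min_replicas, min(max_replicas, raw))

print(desired_replicas(100, 8, 3, 20))  # -> 13 (ceil(100 / 8) = 13)
print(desired_replicas(0, 8, 3, 20))    # -> 3  (floor at min_replicas)
```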

Integrate with your stack

Integrate Cortex with any data science platform and CI/CD tooling, without changing your workflow.

# predictor.py

import tensorflow
import torch
import transformers
import mlflow

...

Run on your AWS account

Run Cortex on your AWS account (GCP support is coming soon), maintaining control over resource utilization and data access.

# cluster.yaml

region: us-west-2
instance_type: g4dn.xlarge
spot: true
min_instances: 1
max_instances: 5

Focus on machine learning, not DevOps

You don't need to bring your own cluster or containerize your models; Cortex automates your cloud infrastructure.

$ cortex cluster up

configuring networking ...
configuring logging ...
configuring metrics ...
configuring autoscaling ...

cortex is ready!

Get started

bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/0.20/get-cli.sh)"

See our installation guide, then deploy one of our examples or bring your own models to build realtime APIs and batch APIs.

About

Deploy machine learning in production. Website: cortex.dev. License: Apache-2.0. Latest release: v0.20.0.