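Vectors for text and images are usually generated with existing pretrained models. Popular model hosting platforms, such as HuggingFace internationally and ModelScope in China, already host a large number of open-source embedding models.
These platforms generally wrap model loading and inference in their own SDKs, which lets users get started with a model quickly. The sections below briefly show how to generate vectors from text and from images with the SDKs of several platforms.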
Generating vectors from text
OpenAI (official, paid)
Install the dependency.
pip install -U openai
An example of generating a vector from text follows.
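from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

def generate_embedding_by_openai(text):
    # openai>=1.0 removed openai.embeddings_utils, so call the embeddings endpoint directly
    response = client.embeddings.create(input=text, model="text-embedding-ada-002")
    return response.data[0].embedding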
HuggingFace
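Install the dependencies. The example also imports torch.
pip install -U transformers torch
An example of generating vectors from text follows. If the model is not in the local cache, it is downloaded from HuggingFace by default.
from transformers import AutoTokenizer, AutoModel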
import torch
import torch.nn.functional as F
# Mean pooling: take the attention mask into account for correct averaging
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # first element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
# Sentences we want sentence embeddings for
sentences = ['This is an example sentence', 'Each sentence is converted']
# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')
model = AutoModel.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')
# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)
# Perform pooling
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
# Normalize embeddings
sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)
print("Sentence embeddings:")
print(sentence_embeddings)
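To make the masked mean pooling step above concrete, here is a minimal pure-Python sketch on toy data. The helper name `masked_mean` and the numbers are illustrative assumptions, not part of any library. It mirrors what `mean_pooling` does for a single sentence: padded positions (mask value 0) are left out of the average.

```python
def masked_mean(token_embeddings, attention_mask):
    """Average token vectors, ignoring positions whose mask value is 0."""
    dim = len(token_embeddings[0])
    sums = [0.0] * dim
    for vector, mask in zip(token_embeddings, attention_mask):
        for i in range(dim):
            sums[i] += vector[i] * mask
    # Clamp the divisor, as the torch version does, to avoid division by zero
    count = max(sum(attention_mask), 1e-9)
    return [s / count for s in sums]


# Toy input (hypothetical values): three token positions, the last one is padding
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = masked_mean(tokens, mask)
print(pooled)
```

Because the padded vector is multiplied by 0 and the divisor counts only real tokens, the large padding values do not affect the result.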
Sentence-Transformers
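Sentence-Transformers focuses on text processing, and the models it has released generally perform well.
Install the dependency.
pip install -U sentence-transformers
An example of generating vectors from text follows. If the model is not in the local cache, it is downloaded from HuggingFace by default.
from sentence_transformers import SentenceTransformer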
sentences = ["This is an example sentence", "Each sentence is converted"]
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
embeddings = model.encode(sentences)
print(embeddings)
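Embeddings such as those returned by `model.encode` are commonly compared with cosine similarity. The sketch below is a pure-Python illustration on toy vectors. The helper names `l2_normalize` and `cosine_similarity` and the input values are assumptions made for this example, not part of Sentence-Transformers.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def cosine_similarity(a, b):
    """Cosine similarity computed as the dot product of the normalized vectors."""
    a_unit, b_unit = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_unit, b_unit))


# Toy vectors standing in for real embeddings
same_direction = cosine_similarity([3.0, 4.0], [6.0, 8.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

Vectors pointing in the same direction score close to 1 regardless of their length, while orthogonal vectors score 0.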
ModelScope
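In the ModelScope framework, text embedding models can be used through a simple Pipeline call. ModelScope wraps them in a unified interface that provides single-sentence embedding, sentence-pair similarity, and multi-candidate similarity computation.
Install the dependency.
pip install -U modelscope
An example of generating vectors from text follows. If the model is not in the local cache, it is downloaded from ModelScope by default.
from modelscope.models import Model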
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
model_id = "damo/nlp_corom_sentence-embedding_chinese-base"
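pipeline_se = pipeline(Tasks.sentence_embedding,
                       model=model_id)
# When the input contains both "source_sentence" and "sentences_to_compare", the output holds the embeddings of the first sentence in source_sentence and of every sentence in sentences_to_compare, plus the similarity score between that first sentence and each sentence in sentences_to_compare.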
inputs = {
    "source_sentence": ["吃完海鲜可以喝牛奶吗?"],
    "sentences_to_compare": [
        "不可以,早晨喝牛奶不科学",
        "吃了海鲜后是不能再喝牛奶的,因为牛奶中含得有维生素C,如果海鲜喝牛奶一起服用会对人体造成一定的伤害",
        "吃海鲜是不能同时喝牛奶吃水果,这个至少间隔6小时以上才可以。",
        "吃海鲜是不可以吃柠檬的因为其中的维生素C会和海鲜中的矿物质形成砷"
    ]
}
result = pipeline_se(input=inputs)
print(result)
# {'text_embedding': array([[-0.2321947 , 0.41309452, 0.26903808, ..., -0.27691665,
# 0.39870635, 0.26265666],
# [-0.2236533 , 0.4202284 , 0.2666558 , ..., -0.26484373,
# 0.40744486, 0.27727932],
# [-0.25315344, 0.38203233, 0.24046004, ..., -0.32800043,
# 0.41472995, 0.29768184],
# [-0.24323441, 0.41074473, 0.24910843, ..., -0.30696338,
# 0.40286067, 0.2736369 ],
# [-0.25041905, 0.37499064, 0.24194787, ..., -0.31972343,
# 0.41340488, 0.27778068]], dtype=float32), 'scores': [70.26203918457031, 70.42508697509766, 70.55732727050781, 70.36207580566406]}
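The `scores` list in the sample output above has one entry per sentence in `sentences_to_compare`. As a small sketch, the candidates can be ranked by score as follows. It assumes a higher score means a closer match, and it reuses the scores printed above instead of calling the pipeline.

```python
# Scores copied from the sample output above, one per entry in sentences_to_compare
scores = [70.26203918457031, 70.42508697509766, 70.55732727050781, 70.36207580566406]

# Sort candidate indices from highest to lowest score
# (assumption: a higher score means a closer match)
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
best_index = ranked[0]
print(ranked, best_index)
```

With a real pipeline result, `scores` would be read from `result['scores']`, and `best_index` would index into `sentences_to_compare`.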
Towhee
Install the dependency.
pip install -U towhee
An example of generating vectors from text follows.
from towhee import AutoPipes
# get the built-in sentence_embedding pipeline
sentence_embedding = AutoPipes.pipeline('sentence_embedding')
# generate embedding for one sentence
embedding = sentence_embedding('how are you?').get()
# batch generate embeddings for multi-sentences
embeddings = sentence_embedding.batch(['how are you?', 'how old are you?'])
embeddings = [e.get() for e in embeddings]
References
- https://huggingface.co/models?pipeline_tag=sentence-similarity&sort=downloads
- https://modelscope.cn/models?page=1&tasks=sentence-embedding&type=nlp
- https://towhee.io/tasks/detail/pipeline/sentence-similarity
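Generating vectors from images
HuggingFace
Install the dependencies. The examples also import torch and PIL.
pip install -U transformers torch pillow
Examples related to generating vectors from images follow. They use the CLIP model open-sourced by OpenAI.
Example 1: generating a vector from an image.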
from PIL import Image
import requests
import torch
from transformers import CLIPProcessor, CLIPModel
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")
url = "https://img.yuanmabao.com/zijie/pic/2024/03/13/xg1ylye3syn.jpg"
image = Image.open(requests.get(url, stream=True).raw)
# Inference only, so disable gradient tracking
with torch.no_grad():
    inputs = processor(images=image, return_tensors="pt", padding=True)
    image_features = model.get_image_features(pixel_values=inputs.pixel_values)
print('image_features:', image_features)
Example 2: image classification.
from PIL import Image
import requests
from transformers import CLIPProcessor, CLIPModel
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")
url = "https://img.yuanmabao.com/zijie/pic/2024/03/13/xg1ylye3syn.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(text=["a photo of a cat", "a photo of a dog"], images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
logits_per_image = outputs.logits_per_image # this is the image-text similarity score
probs = logits_per_image.softmax(dim=1) # we can take the softmax to get the label probabilities
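The last step above turns image-text similarity scores into label probabilities with a softmax. The following pure-Python sketch shows that step in isolation. The `softmax` helper and the logit values are hypothetical, chosen only for illustration, and are not produced by the CLIP model.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["a photo of a cat", "a photo of a dog"]
toy_logits = [2.0, 0.0]  # hypothetical image-text similarity scores for one image
probs = softmax(toy_logits)
# The predicted label is the one with the highest probability
predicted = labels[probs.index(max(probs))]
print(probs, predicted)
```

With the real model, the same selection can be applied to each row of `probs`, one row per input image.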
ModelScope
Install the dependency.
pip install -U modelscope
Examples of generating vectors from images follow.
Example 1: same-product feature embedding, https://modelscope.cn/models/damo/cv_resnet50_product-bag-embedding-models/summary
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
product_embedding = pipeline(
    Tasks.product_retrieval_embedding,
    model='damo/cv_resnet50_product-bag-embedding-models')
result = product_embedding('https://img.yuanmabao.com/zijie/pic/2024/03/13/zrhbsgp2s1g.jpg')
Example 2: person image feature extraction (person re-identification), https://modelscope.cn/models/damo/cv_passvitb_image-reid-person_market/summary
from modelscope.outputs import OutputKeys
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
model_id = 'damo/cv_passvitb_image-reid-person_market'
input_location = 'https://img.yuanmabao.com/zijie/pic/2024/03/13/l0stxqgy2vk.jpg'
image_reid_person = pipeline(Tasks.image_reid_person, model=model_id)
result = image_reid_person(input_location)
print("result is : ", result[OutputKeys.IMG_EMBEDDING])
Example 3: image classification, https://modelscope.cn/models/Fengshenbang/Taiyi-CLIP-Roberta-large-326M-Chinese/summary
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
if __name__ == '__main__':
    model = "Fengshenbang/Taiyi-CLIP-Roberta-large-326M-Chinese"
    pipe = pipeline(Tasks.text_classification, model=model, model_revision='v1.0.2')
    instruction = {'query_texts': ["一只猫", "一只狗", '两只猫', '两只老虎', '一只老虎'],
                   'url': 'https://img.yuanmabao.com/zijie/pic/2024/03/13/xg1ylye3syn.jpg'}
    output = pipe(instruction)
    print(output)
    # output: {'scores': 0.98992366, 'labels': '两只猫'}
Towhee
Install the dependency.
pip install -U towhee
An example of generating vectors from images follows.
from towhee import AutoPipes
# get the built-in text_image_embedding pipeline
image_embedding = AutoPipes.pipeline('text_image_embedding')
# generate image embedding
embedding = image_embedding('./test1.png').get()
# batch generate image embeddings
embeddings = image_embedding.batch(['./test1.png', './test2.png'])
embeddings = [e.get() for e in embeddings]
References
- https://huggingface.co/openai/clip-vit-large-patch14
- https://towhee.io/tasks/detail/pipeline/text-image-search