TensorFlow Serving with models at an S3 address: Could not find base path?

2020-08-06 10:44:38

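A colleague recently ran into a problem while configuring a Serving service through a Workload. The model_config_file option was used to specify multiple models, and the config file looked roughly like this.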

➜  tmp cat model.config
model_config_list {
 
  config {
    name:'10062'
    base_path:'s3://xxx-ai/humanoid/10062'
    model_platform:'tensorflow'
  }
 
  config {
    name:'10075'
    base_path:'s3://xxx-ai/humanoid/10075'
    model_platform:'tensorflow'
  }
}

However, the Serving process failed on startup with the error Could not find base path xxxxxx. So the base path could not be found?

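The cause is in the config file: the base_path values have no trailing slash /. On S3, a path without a trailing / is treated as an object rather than a directory. The Serving source that reads a model's base path, PollFileSystemForServable, is reproduced below. It first checks that the base path exists and then lists the files under it. For an S3 path without the slash, no object with that exact key exists, so the existence check fails and Serving reports the error. The fix is simply to append a / to the base_path, for example s3://xxx-ai/humanoid/10075/. The problem does not occur when the model lives on a local file system.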
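Applied to the config shown at the top of this post, the fix would look roughly like the sketch below. The bucket and model names are the same placeholders as in the original config; the only change is the trailing slash on each base_path.

```text
model_config_list {
  config {
    name:'10062'
    base_path:'s3://xxx-ai/humanoid/10062/'
    model_platform:'tensorflow'
  }
  config {
    name:'10075'
    base_path:'s3://xxx-ai/humanoid/10075/'
    model_platform:'tensorflow'
  }
}
```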

// Like PollFileSystemForConfig(), but for a single servable.
Status PollFileSystemForServable(
    const FileSystemStoragePathSourceConfig::ServableToMonitor& servable,
    std::vector<ServableData<StoragePath>>* versions) {
  // First, determine whether the base path exists. This check guarantees that
  // we don't emit an empty aspired-versions list for a non-existent (or
  // transiently unavailable) base-path. (On some platforms, GetChildren()
  // returns an empty list instead of erring if the base path isn't found.)
  if (!Env::Default()->FileExists(servable.base_path()).ok()) {
    return errors::InvalidArgument("Could not find base path ",
                                   servable.base_path(), " for servable ",
                                   servable.servable_name());
  }
 
  // Retrieve a list of base-path children from the file system.
  std::vector<string> children;
  TF_RETURN_IF_ERROR(
      Env::Default()->GetChildren(servable.base_path(), &children));
 
  // GetChildren() returns all descendants instead for cloud storage like GCS.
  // In such case we should filter out all non-direct descendants.
  std::set<string> real_children;
  for (int i = 0; i < children.size(); ++i) {
    const string& child = children[i];
    real_children.insert(child.substr(0, child.find_first_of('/')));
  }
  children.clear();
  children.insert(children.begin(), real_children.begin(), real_children.end());
  const std::map<int64 /* version */, string /* child */> children_by_version =
      IndexChildrenByVersion(children);
 
  bool at_least_one_version_found = false;
  switch (servable.servable_version_policy().policy_choice_case()) {
    case FileSystemStoragePathSourceConfig::ServableVersionPolicy::
        POLICY_CHOICE_NOT_SET:
      TF_FALLTHROUGH_INTENDED;  // Default policy is kLatest.
    case FileSystemStoragePathSourceConfig::ServableVersionPolicy::kLatest:
      at_least_one_version_found =
          AspireLatestVersions(servable, children_by_version, versions);
      break;
    case FileSystemStoragePathSourceConfig::ServableVersionPolicy::kAll:
      at_least_one_version_found =
          AspireAllVersions(servable, children, versions);
      break;
    case FileSystemStoragePathSourceConfig::ServableVersionPolicy::kSpecific: {
      at_least_one_version_found =
          AspireSpecificVersions(servable, children_by_version, versions);
      break;
    }
    default:
      return errors::Internal("Unhandled servable version_policy: ",
                              servable.servable_version_policy().DebugString());
  }
 
  if (!at_least_one_version_found) {
    LOG(WARNING) << "No versions of servable " << servable.servable_name()
                 << " found under base path " << servable.base_path();
  }
 
  return Status::OK();
}
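The step in the source that filters out non-direct descendants can be easier to follow in isolation. Below is a minimal standalone sketch of that loop. DirectChildren is a hypothetical helper name used only for illustration; it is not part of TensorFlow Serving, and it mirrors nothing beyond the substr/find_first_of logic in the listing above.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not part of TensorFlow Serving): collapses a flat
// listing of descendants, as cloud storage may return it, into the set of
// direct child names under the base path.
std::set<std::string> DirectChildren(const std::vector<std::string>& children) {
  std::set<std::string> real_children;
  for (const std::string& child : children) {
    // find_first_of returns npos when the name contains no '/', in which
    // case substr(0, npos) keeps the whole name unchanged.
    real_children.insert(child.substr(0, child.find_first_of('/')));
  }
  return real_children;
}
```

For a listing such as 1/saved_model.pb, 1/variables/variables.index and 2/saved_model.pb, this yields the names 1 and 2, which the real code then indexes by version number.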
