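I. Background
1. The "Hive" being connected to is not a real Hive. It is Kyuubi on Spark wired to the Hive metastore database, set up to replace Hive, which runs slowly, while still offering JDBC connections.
2. The Superset Docker image is built on the official Apache image. The Dockerfile is as follows: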
FROM apache/superset:latest-dev
USER 0
RUN apt-get update \
    && apt-get install -y procps vim net-tools iputils-ping
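3. The YAML used for the k8s deployment. The Superset port is changed here: setting SUPERSET_WEBSERVER_PORT=8888 in the config file (superset_config.py) had no effect, and reading the startup script showed that the port can only be changed through the environment variable SUPERSET_PORT=8888. According to the official documentation for deploying Superset with Docker, the config file belongs at /app/pythonpath/superset_config.py.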
apiVersion: v1
kind: Service
metadata:
  name: superset-service
  namespace: kyuubi
  labels:
    run: superset-service
spec:
  ports:
    - port: 80
      protocol: TCP
      name: http
    - port: 8888
      protocol: TCP
      name: hue
  selector:
    run: superset-service
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: superset-service
  namespace: kyuubi
spec:
  selector:
    matchLabels:
      run: superset-service
  replicas: 1
  template:
    metadata:
      labels:
        run: superset-service
    spec:
      containers:
        - name: superset-service
          image: dockerhub/mysoft:superset-v1.5
          env:
            - name: SUPERSET_PORT
              value: "8888"
          volumeMounts:
            - name: superset-config
              mountPath: /app/pythonpath/superset_config.py
              subPath: superset_config.py
      volumes:
        - name: superset-config
          configMap:
            name: superset-config
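4. The configuration is injected through a k8s ConfigMap that overrides the file; only the database setting is changed. The YAML is as follows: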
apiVersion: v1
kind: ConfigMap
metadata:
  name: superset-config
  namespace: kyuubi
data:
  superset_config.py: |
    SQLALCHEMY_DATABASE_URI = 'mysql://name:password@host/superset'
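Since superset_config.py is an ordinary Python module, the URI does not have to be a hard-coded literal. The following is only a sketch under stated assumptions: the helper `build_mysql_uri` and the environment variable names `DB_USER`, `DB_PASSWORD`, `DB_HOST` and `DB_NAME` are hypothetical and not part of the setup described here. It shows one way to assemble the same URI while percent-encoding credentials that contain characters such as `@` or `/`.

```python
import os
from urllib.parse import quote_plus


def build_mysql_uri(user: str, password: str, host: str, database: str) -> str:
    """Assemble a SQLAlchemy MySQL URI, percent-encoding the credentials."""
    return f"mysql://{quote_plus(user)}:{quote_plus(password)}@{host}/{database}"


# Hypothetical variable names; the defaults mirror the placeholders used above.
SQLALCHEMY_DATABASE_URI = build_mysql_uri(
    os.environ.get("DB_USER", "name"),
    os.environ.get("DB_PASSWORD", "password"),
    os.environ.get("DB_HOST", "host"),
    os.environ.get("DB_NAME", "superset"),
)
```

With this variant the values would have to be supplied through the container's `env` section, next to SUPERSET_PORT.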
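Once the container is up, initialize Superset inside it. The commands needed are:
superset db upgrade: initializes the database and creates the tables
superset init: initializes Superset
superset fab create-admin: creates the Superset admin user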
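II. The problem
A Hive connection was created through Database Connections. Creating it reports an error, yet the connection is created successfully: the record exists and it works normally.
Opening the details of the Hive connection raises an error.
III. The fix
Check the container logs.
The logs show that the error is raised when Flask's JSON dumps runs, and the call stack shows that it ends up calling dumps from Python's own json module. Looking at /usr/local/lib/python3.8/site-packages/flask/json/__init__.py, Flask defines its own JSONEncoder there, and the catch is that it does not handle the bytes type.
Modify it so that bytes values are converted to strings.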
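The failure and the idea behind the fix can be reproduced with the standard library alone. This is a minimal sketch, not the patched Flask file: the class name `BytesAwareEncoder` and the sample payload are hypothetical, and `errors="replace"` is an addition of this sketch so that bytes that are not valid UTF-8 do not raise a second error.

```python
import json


class BytesAwareEncoder(json.JSONEncoder):
    """JSON encoder that also accepts bytes (hypothetical name)."""

    def default(self, o):
        if isinstance(o, bytes):
            # Decode bytes to text; errors="replace" is an addition of this sketch.
            return o.decode("utf-8", errors="replace")
        # Everything else falls back to the stock behaviour (raises TypeError).
        return super().default(o)


# Hypothetical payload: a record in which one field arrives as bytes.
payload = {"database_name": "hive", "extra": b'{"engine_params": {}}'}

try:
    json.dumps(payload)
except TypeError as exc:
    print(f"stock encoder fails: {exc}")

print(json.dumps(payload, cls=BytesAwareEncoder))
```

The docstring of Flask's own JSONEncoder says a subclass can be assigned to `flask.Flask.json_encoder`. Whether doing that from Superset's configuration would remove the need to patch the file under site-packages is untested here.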
Then put the modified file in place through the ConfigMap and a container volume mount. The modified YAML is as follows:
configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: superset-config
  namespace: kyuubi
data:
  superset_config.py: |
    SQLALCHEMY_DATABASE_URI = 'mysql://name:password@host/superset'
  __init__.py: |
    import decimal
    import io
    import json as _json
    import typing as t
    import uuid
    import warnings
    from datetime import date

    from jinja2.utils import htmlsafe_json_dumps as _jinja_htmlsafe_dumps
    from werkzeug.http import http_date

    from ..globals import current_app
    from ..globals import request

    if t.TYPE_CHECKING:
        from ..app import Flask
        from ..wrappers import Response

    try:
        import dataclasses
    except ImportError:
        # Python < 3.7
        dataclasses = None  # type: ignore


    class JSONEncoder(_json.JSONEncoder):
        """The default JSON encoder. Handles extra types compared to the
        built-in :class:`json.JSONEncoder`.

        - :class:`datetime.datetime` and :class:`datetime.date` are
          serialized to :rfc:`822` strings. This is the same as the HTTP
          date format.
        - :class:`uuid.UUID` is serialized to a string.
        - :class:`dataclasses.dataclass` is passed to
          :func:`dataclasses.asdict`.
        - :class:`~markupsafe.Markup` (or any object with a ``__html__``
          method) will call the ``__html__`` method to get a string.

        Assign a subclass of this to :attr:`flask.Flask.json_encoder` or
        :attr:`flask.Blueprint.json_encoder` to override the default.
        """

        def default(self, o: t.Any) -> t.Any:
            """Convert ``o`` to a JSON serializable type. See
            :meth:`json.JSONEncoder.default`. Python does not support
            overriding how basic types like ``str`` or ``list`` are
            serialized, they are handled before this method.
            """
            if isinstance(o, date):
                return http_date(o)
            if isinstance(o, (decimal.Decimal, uuid.UUID)):
                return str(o)
            # Added for this fix: serialize bytes as a UTF-8 string.
            if isinstance(o, bytes):
                return str(o, encoding="utf-8")
            if dataclasses and dataclasses.is_dataclass(o):
                return dataclasses.asdict(o)
            if hasattr(o, "__html__"):
                return str(o.__html__())
            return super().default(o)


    class JSONDecoder(_json.JSONDecoder):
        """The default JSON decoder.

        This does not change any behavior from the built-in
        :class:`json.JSONDecoder`.

        Assign a subclass of this to :attr:`flask.Flask.json_decoder` or
        :attr:`flask.Blueprint.json_decoder` to override the default.
        """


    def _dump_arg_defaults(
        kwargs: t.Dict[str, t.Any], app: t.Optional["Flask"] = None
    ) -> None:
        """Inject default arguments for dump functions."""
        if app is None:
            app = current_app

        if app:
            cls = app.json_encoder
            bp = app.blueprints.get(request.blueprint) if request else None  # type: ignore
            if bp is not None and bp.json_encoder is not None:
                cls = bp.json_encoder

            # Only set a custom encoder if it has custom behavior. This is
            # faster on PyPy.
            if cls is not _json.JSONEncoder:
                kwargs.setdefault("cls", cls)

            kwargs.setdefault("cls", cls)
            kwargs.setdefault("ensure_ascii", app.config["JSON_AS_ASCII"])
            kwargs.setdefault("sort_keys", app.config["JSON_SORT_KEYS"])
        else:
            kwargs.setdefault("sort_keys", True)
            kwargs.setdefault("cls", JSONEncoder)


    def _load_arg_defaults(
        kwargs: t.Dict[str, t.Any], app: t.Optional["Flask"] = None
    ) -> None:
        """Inject default arguments for load functions."""
        if app is None:
            app = current_app

        if app:
            cls = app.json_decoder
            bp = app.blueprints.get(request.blueprint) if request else None  # type: ignore
            if bp is not None and bp.json_decoder is not None:
                cls = bp.json_decoder

            # Only set a custom decoder if it has custom behavior. This is
            # faster on PyPy.
            if cls not in {JSONDecoder, _json.JSONDecoder}:
                kwargs.setdefault("cls", cls)


    def dumps(obj: t.Any, app: t.Optional["Flask"] = None, **kwargs: t.Any) -> str:
        """Serialize an object to a string of JSON.

        Takes the same arguments as the built-in :func:`json.dumps`, with
        some defaults from application configuration.

        :param obj: Object to serialize to JSON.
        :param app: Use this app's config instead of the active app context
            or defaults.
        :param kwargs: Extra arguments passed to :func:`json.dumps`.

        .. versionchanged:: 2.0.2
            :class:`decimal.Decimal` is supported by converting to a string.

        .. versionchanged:: 2.0
            ``encoding`` is deprecated and will be removed in Flask 2.1.

        .. versionchanged:: 1.0.3
            ``app`` can be passed directly, rather than requiring an app
            context for configuration.
        """
        _dump_arg_defaults(kwargs, app=app)
        encoding = kwargs.pop("encoding", None)
        rv = _json.dumps(obj, **kwargs)

        if encoding is not None:
            warnings.warn(
                "'encoding' is deprecated and will be removed in Flask 2.1.",
                DeprecationWarning,
                stacklevel=2,
            )

            if isinstance(rv, str):
                return rv.encode(encoding)  # type: ignore

        return rv


    def dump(
        obj: t.Any, fp: t.IO[str], app: t.Optional["Flask"] = None, **kwargs: t.Any
    ) -> None:
        """Serialize an object to JSON written to a file object.

        Takes the same arguments as the built-in :func:`json.dump`, with
        some defaults from application configuration.

        :param obj: Object to serialize to JSON.
        :param fp: File object to write JSON to.
        :param app: Use this app's config instead of the active app context
            or defaults.
        :param kwargs: Extra arguments passed to :func:`json.dump`.

        .. versionchanged:: 2.0
            Writing to a binary file, and the ``encoding`` argument, is
            deprecated and will be removed in Flask 2.1.
        """
        _dump_arg_defaults(kwargs, app=app)
        encoding = kwargs.pop("encoding", None)
        show_warning = encoding is not None

        try:
            fp.write("")
        except TypeError:
            show_warning = True
            fp = io.TextIOWrapper(fp, encoding or "utf-8")  # type: ignore

        if show_warning:
            warnings.warn(
                "Writing to a binary file, and the 'encoding' argument, is"
                " deprecated and will be removed in Flask 2.1.",
                DeprecationWarning,
                stacklevel=2,
            )

        _json.dump(obj, fp, **kwargs)


    def loads(s: str, app: t.Optional["Flask"] = None, **kwargs: t.Any) -> t.Any:
        """Deserialize an object from a string of JSON.

        Takes the same arguments as the built-in :func:`json.loads`, with
        some defaults from application configuration.

        :param s: JSON string to deserialize.
        :param app: Use this app's config instead of the active app context
            or defaults.
        :param kwargs: Extra arguments passed to :func:`json.loads`.

        .. versionchanged:: 2.0
            ``encoding`` is deprecated and will be removed in Flask 2.1. The
            data must be a string or UTF-8 bytes.

        .. versionchanged:: 1.0.3
            ``app`` can be passed directly, rather than requiring an app
            context for configuration.
        """
        _load_arg_defaults(kwargs, app=app)
        encoding = kwargs.pop("encoding", None)

        if encoding is not None:
            warnings.warn(
                "'encoding' is deprecated and will be removed in Flask 2.1."
                " The data must be a string or UTF-8 bytes.",
                DeprecationWarning,
                stacklevel=2,
            )

            if isinstance(s, bytes):
                s = s.decode(encoding)

        return _json.loads(s, **kwargs)


    def load(fp: t.IO[str], app: t.Optional["Flask"] = None, **kwargs: t.Any) -> t.Any:
        """Deserialize an object from JSON read from a file object.

        Takes the same arguments as the built-in :func:`json.load`, with
        some defaults from application configuration.

        :param fp: File object to read JSON from.
        :param app: Use this app's config instead of the active app context
            or defaults.
        :param kwargs: Extra arguments passed to :func:`json.load`.

        .. versionchanged:: 2.0
            ``encoding`` is deprecated and will be removed in Flask 2.1. The
            file must be text mode, or binary mode with UTF-8 bytes.
        """
        _load_arg_defaults(kwargs, app=app)
        encoding = kwargs.pop("encoding", None)

        if encoding is not None:
            warnings.warn(
                "'encoding' is deprecated and will be removed in Flask 2.1."
                " The file must be text mode, or binary mode with UTF-8"
                " bytes.",
                DeprecationWarning,
                stacklevel=2,
            )

            if isinstance(fp.read(0), bytes):
                fp = io.TextIOWrapper(fp, encoding)  # type: ignore

        return _json.load(fp, **kwargs)


    def htmlsafe_dumps(obj: t.Any, **kwargs: t.Any) -> str:
        """Serialize an object to a string of JSON with :func:`dumps`, then
        replace HTML-unsafe characters with Unicode escapes and mark the
        result safe with :class:`~markupsafe.Markup`.

        This is available in templates as the ``|tojson`` filter.

        The returned string is safe to render in HTML documents and
        ``<script>`` tags. The exception is in HTML attributes that are
        double quoted; either use single quotes or the ``|forceescape``
        filter.

        .. versionchanged:: 2.0
            Uses :func:`jinja2.utils.htmlsafe_json_dumps`. The returned
            value is marked safe by wrapping in :class:`~markupsafe.Markup`.

        .. versionchanged:: 0.10
            Single quotes are escaped, making this safe to use in HTML,
            ``<script>`` tags, and single-quoted attributes without further
            escaping.
        """
        return _jinja_htmlsafe_dumps(obj, dumps=dumps, **kwargs)


    def htmlsafe_dump(obj: t.Any, fp: t.IO[str], **kwargs: t.Any) -> None:
        """Serialize an object to JSON written to a file object, replacing
        HTML-unsafe characters with Unicode escapes. See
        :func:`htmlsafe_dumps` and :func:`dumps`.
        """
        fp.write(htmlsafe_dumps(obj, **kwargs))


    def jsonify(*args: t.Any, **kwargs: t.Any) -> "Response":
        """Serialize data to JSON and wrap it in a :class:`~flask.Response`
        with the :mimetype:`application/json` mimetype.

        Uses :func:`dumps` to serialize the data, but ``args`` and
        ``kwargs`` are treated as data rather than arguments to
        :func:`json.dumps`.

        1. Single argument: Treated as a single value.
        2. Multiple arguments: Treated as a list of values.
           ``jsonify(1, 2, 3)`` is the same as ``jsonify([1, 2, 3])``.
        3. Keyword arguments: Treated as a dict of values.
           ``jsonify(data=data, errors=errors)`` is the same as
           ``jsonify({"data": data, "errors": errors})``.
        4. Passing both arguments and keyword arguments is not allowed as
           it's not clear what should happen.

        .. code-block:: python

            from flask import jsonify

            @app.route("/users/me")
            def get_current_user():
                return jsonify(
                    username=g.user.username,
                    email=g.user.email,
                    id=g.user.id,
                )

        Will return a JSON response like this:

        .. code-block:: javascript

            {
              "username": "admin",
              "email": "admin@localhost",
              "id": 42
            }

        The default output omits indents and spaces after separators. In
        debug mode or if :data:`JSONIFY_PRETTYPRINT_REGULAR` is ``True``,
        the output will be formatted to be easier to read.

        .. versionchanged:: 2.0.2
            :class:`decimal.Decimal` is supported by converting to a string.

        .. versionchanged:: 0.11
            Added support for serializing top-level arrays. This introduces
            a security risk in ancient browsers. See :ref:`security-json`.

        .. versionadded:: 0.2
        """
        indent = None
        separators = (",", ":")

        if current_app.config["JSONIFY_PRETTYPRINT_REGULAR"] or current_app.debug:
            indent = 2
            separators = (", ", ": ")

        if args and kwargs:
            raise TypeError("jsonify() behavior undefined when passed both args and kwargs")
        elif len(args) == 1:  # single args are passed directly to dumps()
            data = args[0]
        else:
            data = args or kwargs

        return current_app.response_class(
            f"{dumps(data, indent=indent, separators=separators)}\n",
            mimetype=current_app.config["JSONIFY_MIMETYPE"],
        )
The deployment YAML:
apiVersion: v1
kind: Service
metadata:
  name: superset-service
  namespace: kyuubi
  labels:
    run: superset-service
spec:
  ports:
    - port: 80
      protocol: TCP
      name: http
    - port: 8888
      protocol: TCP
      name: hue
  selector:
    run: superset-service
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: superset-service
  namespace: kyuubi
spec:
  selector:
    matchLabels:
      run: superset-service
  replicas: 1
  template:
    metadata:
      labels:
        run: superset-service
    spec:
      containers:
        - name: superset-service
          image: dockerhub/mysoft:superset-v1.5
          env:
            - name: SUPERSET_PORT
              value: "8888"
          volumeMounts:
            - name: superset-config
              mountPath: /app/pythonpath/superset_config.py
              subPath: superset_config.py
            - name: superset-config
              mountPath: /usr/local/lib/python3.8/site-packages/flask/json/__init__.py
              subPath: __init__.py
      volumes:
        - name: superset-config
          configMap:
            name: superset-config
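Final result: all good, no more errors.
IV. A side note
Hue was used originally. The reason for switching to Superset: with Hive, the JDBC connection parameters basically never needed changing, but after moving to Kyuubi on Spark the connection has to be tuned (a later post will cover replacing Hive with Kyuubi on Spark). For example, some statements cannot run with the default Spark executor memory, so the executor memory parameter has to be adjusted on the JDBC connection. Hue has no such feature, and adding it would mean modifying fairly complex source code and front-end code. After looking around, only Superset met the need, and its interface is clean and pleasant to use. Finally, here is how to add connection parameters in Superset.
The JSON structure is as follows:
{"connect_args":{"configuration":{"spark.executor.memory":"15000m","hive.server2.proxy.user":"gyx"}}}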
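When several connections need different Spark settings, the same structure can be generated rather than typed by hand. The following is a small sketch using only the standard library; the helper name `build_engine_extra` is hypothetical, and it simply reproduces the structure shown above with all values kept as strings.

```python
import json


def build_engine_extra(spark_conf: dict, proxy_user: str) -> str:
    """Return the connect_args JSON for a connection (hypothetical helper)."""
    # Copy so the caller's dict is not modified.
    configuration = dict(spark_conf)
    configuration["hive.server2.proxy.user"] = proxy_user
    return json.dumps({"connect_args": {"configuration": configuration}})


extra = build_engine_extra({"spark.executor.memory": "15000m"}, proxy_user="gyx")
print(extra)
```

The printed string can be pasted wherever the JSON above is entered for the connection.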