[Writing Environments, Part 1] Common Python functions and how to handle them

1. Merging two one-dimensional lists into one two-dimensional list in Python


>>> list1 = [1,2,3,4,4]
>>> list2 = [2,3,4,5,2]
>>> z = list(zip(list1,list2))
>>> z
[(1, 2), (2, 3), (3, 4), (4, 5), (4, 2)]
>>> z[1][1]
3
>>> z[0][1]
2
>>> z[2]
(3, 4)


a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
c = []  # the result list must exist before append() is called
n = len(a)
for i in range(n):
    c.append([a[i], b[i]])  # one [a, b] pair per index
print(c)
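
The index loop above can also be written with zip and a list comprehension. The following is a minimal sketch; the variable names pairs and short are illustrative and not from the original post. Unlike list(zip(...)), which yields tuples, this produces inner lists that can be modified later.

```python
a = [1, 2, 3, 4]
b = [5, 6, 7, 8]

# zip pairs elements by position; list(pair) turns each tuple into a mutable list
pairs = [list(pair) for pair in zip(a, b)]
print(pairs)  # [[1, 5], [2, 6], [3, 7], [4, 8]]

# zip stops at the shorter input, so unequal lengths silently drop the tail
short = [list(pair) for pair in zip([1, 2, 3], [10, 20])]
print(short)  # [[1, 10], [2, 20]]
```

If the two lists are expected to have the same length, it is worth checking len(a) == len(b) first, because zip will not complain about a mismatch.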

2. Box() creates continuous spaces; Dict() combines several spaces


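import numpy as np
from gym import spaces

# Specify the minimum, maximum and shape of the action space
action_space1 = spaces.Box(low=-1, high=1, shape=(1, ))
# low and high can also be vectors; samples then have the same shape as the vectors,
# so shape does not need to be given (low and high must have the same shape)
low = np.float32(np.zeros(3))
high = np.float32(np.ones(3))
action_space2 = spaces.Box(low=low,high=high)

# Draw random samples and look at the output
print(action_space1.sample())
print(action_space2.sample())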


Output:
[0.17719302]
[0.09906013 0.6137293  0.5404117 ]

Dict() stores the components of a space in a dictionary, so it can describe more properties and be used to build more complex spaces.


import numpy as np
from gym import spaces

low = np.float32(np.zeros(3))
high = np.float32(np.ones(3))

# Create an action space that holds two different actions
action_space = spaces.Dict({
            "action1": spaces.Discrete(10),
            "action2": spaces.Box(low=low,high=high)
        })
print(action_space.sample())


Output:
OrderedDict([('action1', 6), ('action2', array([0.23961447, 0.6493422 , 0.267231  ], dtype=float32))])
As you can see, the action space contains two actions, action1 and action2.

Discrete() creates a discrete space of non-negative integers.


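# Specify how many actions the action space contains in total
# For example, in the well-known CartPole example the cart has only two actions,
# left or right, so n is set to 2 and each sample is 0 or 1
action_space = spaces.Discrete(2)

# Check the sampled output
for i in range(5):
    print(action_space.sample())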

Output:
0
1
1
1
1

2.1 How to write the code when an OpenAI Gym environment has both Discrete and Box spaces


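import gym
import numpy as np
from gym.spaces import Box, Discrete, Tuple


class SampleGym(gym.Env):
    def __init__(self, config=None):
        # Avoid a mutable default argument; fall back to an empty config
        self.config = config or {}
        # This is the key part: a Tuple combines the Discrete and the Box space
        self.action_space = Tuple((Discrete(2), Box(-10, 10, (2,))))
        self.observation_space = Box(-10, 10, (2, 2))
        self.p_done = self.config.get("p_done", 0.1)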

    def reset(self):
        return self.observation_space.sample()

    def step(self, action):
        chosen_action = action[0]
        cnt_control = action[1][chosen_action]

        if chosen_action == 0:
            reward = cnt_control
        else:
            reward = -cnt_control - 1

        print(f"Action, {chosen_action} continuous ctrl {cnt_control}")
        return (
            self.observation_space.sample(),
            reward,
            bool(np.random.choice([True, False], p=[self.p_done, 1.0 - self.p_done])),
            {},
        )


if __name__ == "__main__":
    env = SampleGym()
    env.reset()
    env.step((1, [-1, 2.1]))  # should say use action 1 with 2.1
    env.step((0, [-1.1, 2.1]))  # should say use action 0 with -1.1

2.2 The discrete and continuous space types in gym

Discrete: integers numbered from 0 to n-1. Example: Discrete(n=4) can represent the four actions up, down, left and right.
Box: an n-dimensional tensor with values in [low, high]. Example: Box(low=0.0, high=255, shape=(210,160,3), dtype=np.uint8) is a 3-D tensor of 100800 bytes.
MultiBinary: a binary space of shape n. Example: MultiBinary(5) is an array of five values, each 0 or 1. MultiBinary([3,2]) is a 3x2 array of 0s and 1s.
MultiDiscrete: a series of discrete action spaces. Example: MultiDiscrete([5,2,2]) is three discrete action spaces with 5, 2 and 2 values.
Tuple: combines several space instances. Example: Tuple(spaces=(Box(low=-1.0, high=1.0, shape=(3,), dtype=np.float32), Discrete(n=3), Discrete(n=2))).
Dict: also combines several space instances, keyed by name. Example: Dict({'position':Discrete(2), 'velocity':Discrete(3)}).
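
The space types listed above can be tried out side by side. The following is a rough sketch that assumes the gym package is installed (the same calls should also work with `from gymnasium import spaces`, but that is an assumption); the dictionary name examples is illustrative only.

```python
import numpy as np
from gym import spaces

# One small instance of each space type described above
examples = {
    "Discrete": spaces.Discrete(4),
    "Box": spaces.Box(low=-1.0, high=1.0, shape=(3,), dtype=np.float32),
    "MultiBinary": spaces.MultiBinary(5),
    "MultiDiscrete": spaces.MultiDiscrete([5, 2, 2]),
    "Tuple": spaces.Tuple((spaces.Discrete(3), spaces.Discrete(2))),
    "Dict": spaces.Dict({"position": spaces.Discrete(2),
                         "velocity": spaces.Discrete(3)}),
}

for name, space in examples.items():
    sample = space.sample()
    # contains() checks that a value is a valid member of the space
    print(name, sample, space.contains(sample))
```

Calling contains() on an action before passing it to step() is a cheap way to catch shape or dtype mistakes early.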


# Example usage [nested]:
self.nested_observation_space = spaces.Dict({
    'sensors':  spaces.Dict({
        'position': spaces.Box(low=-100, high=100, shape=(3,)),  # shape must be a tuple
        'velocity': spaces.Box(low=-1, high=1, shape=(3,)),
        'front_cam': spaces.Tuple((
            spaces.Box(low=0, high=1, shape=(10, 10, 3)),
            spaces.Box(low=0, high=1, shape=(10, 10, 3))
        )),
        'rear_cam': spaces.Box(low=0, high=1, shape=(10, 10, 3)),
    }),
    # MultiDiscrete takes the number of values per dimension: 5, 2 and 2
    'ext_controller': spaces.MultiDiscrete((5, 2, 2)),
    'inner_state':spaces.Dict({
        'charge': spaces.Discrete(100),
        'system_checks': spaces.MultiBinary(10),
        'job_status': spaces.Dict({
            'task': spaces.Discrete(5),
            'progress': spaces.Box(low=0, high=100, shape=()),
        })
    })
})

3. Generating 10 distinct random integers (0 to 9 in the code below)

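random.sample(range(0,10),10) draws 10 distinct integers from 0 to 9; use range(1,11) to get 1 to 10 instead. To generate integers that may repeat, use np.random.randint(0,10,size=10).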


import random

user_x = random.sample(range(0,10),10)  # 10 distinct random integers from 0 to 9
user_y = random.sample(range(0,10),10)  # 10 distinct random integers from 0 to 9
user_location=list(zip(user_x,user_y))  # pair them up as (x, y) coordinates
print(user_x,user_y)
print(user_location)
print(user_x[0],user_y[0])
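
The difference between drawing without and with repetition can be shown with the standard library alone. The following is a small sketch; the seed value and the names rng, distinct and repeated are arbitrary choices for illustration.

```python
import random

rng = random.Random(0)  # seeded generator so the run is repeatable (seed is arbitrary)

# Without replacement: every value appears exactly once, a permutation of 1..10
distinct = rng.sample(range(1, 11), 10)

# With replacement: values may repeat
repeated = rng.choices(range(1, 11), k=10)

print(distinct)
print(repeated)
```

random.sample raises ValueError if more items are requested than the population holds, which is a useful guard when the count comes from a config value.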


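import random

# Generate a random number
# a = random.randint(1, 100)
# print(a)


user_num = 5
total = 0  # renamed from sum to avoid shadowing the built-in
# Draw user_num random values (0 or 1) and keep a running total
for i in range(user_num):
    a = random.randint(0, 1)
    total += a
    print(a, total)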

4. Computing the distance between 2-D tuples in a list (square roots)


import math
import random

import numpy as np

user_x = random.sample(range(0,10),10)  # 10 distinct random integers from 0 to 9
user_y = random.sample(range(0,10),10)  # 10 distinct random integers from 0 to 9
user_location=list(zip(user_x,user_y))
# Square-root options: pow(x,y), sqrt, math.hypot
p1=math.hypot((user_location[0][0]-user_location[1][0]),(user_location[0][1]-user_location[1][1]))
p=np.sqrt((user_location[0][0]-user_location[1][0])**2 + (user_location[0][1]-user_location[1][1])**2)
print(user_x,user_y)

print(p,p1)
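
On Python 3.8 or later, math.dist gives the same Euclidean distance without writing out the coordinate differences. The following is a minimal sketch; the points are made-up values chosen so the results are easy to check by hand.

```python
import math
from itertools import combinations

points = [(0, 0), (3, 4), (6, 8)]  # illustrative coordinates, not from the original

# math.dist (Python 3.8+) returns the Euclidean distance between two points
d01 = math.dist(points[0], points[1])
print(d01)  # 5.0

# Distance for every unordered pair of points, keyed by the pair of indices
pairwise = {(i, j): math.dist(points[i], points[j])
            for i, j in combinations(range(len(points)), 2)}
print(pairwise)
```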

5. Randomly drawing a fixed number of elements from a list or array to form a new list or array


>>> import random
>>> mylist=list(range(1,10))
>>> mylist
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> newlist = random.sample(mylist, 3)  # randomly pick 3 elements from mylist
>>> newlist
[4, 7, 2]
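
For a NumPy array the same idea can be expressed with a random Generator. The following is a sketch that assumes NumPy 1.17 or later (where np.random.default_rng exists); the seed and the names arr and picked are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
arr = np.arange(1, 10)          # array([1, 2, ..., 9])

# replace=False guarantees the three picks are distinct elements of arr
picked = rng.choice(arr, size=3, replace=False)
print(picked)
```

The result is a new array; arr itself is left unchanged.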

6. numpy np.finfo(): using eps and max

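np.finfo() returns the machine limits of the floating-point type given in the parentheses, such as eps (the machine epsilon) and max (the largest representable value).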


import numpy as np
a=np.array([[1],[2],[-1],[0]])
b=np.maximum(a,np.finfo(np.float32).eps)
print(b)

Output:

[[1.0000000e+00]
 [2.0000000e+00]
 [1.1920929e-07]
 [1.1920929e-07]]


ious = np.maximum(1.0 * inter_area / union_area, np.finfo(np.float32).eps)

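eps is the machine epsilon: the gap between 1.0 and the next representable float32 value (about 1.19e-07), so it is a very small positive number rather than the smallest one. When the computed IoU is 0 or negative (which looks unlikely from the code), np.maximum replaces it with np.finfo(np.float32).eps.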
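
The following sketch looks at what eps, tiny and max actually are for float32, and why clamping with eps is handy. The array ious here is made-up data, and taking a logarithm afterwards is only one assumed reason for the clamp.

```python
import numpy as np

info = np.finfo(np.float32)
eps = info.eps

print(eps == 2.0 ** -23)                        # True: spacing between 1.0 and the next float32
print(np.float32(1.0) + eps > np.float32(1.0))  # True: adding eps to 1.0 is still visible
print(info.tiny)                                # smallest positive normal float32, far below eps
print(info.max)                                 # largest finite float32

# Clamping keeps a later log (or division) well defined
ious = np.array([0.0, 0.25, 1.0], dtype=np.float32)  # illustrative values
safe = np.maximum(ious, eps)
print(np.log(safe))
```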

Reference: https://www.freesion.com/article/6448186307/

7. Converting a list of matrices into block-diagonal form

7.1 scipy.linalg.block_diag()

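If the agent in the environment is made up of several objects, each with its own transition matrix, scipy.linalg.block_diag() can merge those matrices into a single block-diagonal matrix, which simplifies the computation in step(). My original approach was to keep the objects in a list, with list[i] corresponding to the i-th object. That made the code in step() verbose and slower, because every operation involving several objects needed a loop.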


>>> import numpy as np
>>> from scipy.linalg import block_diag
>>> A = [[1, 0],
...      [0, 1]]
>>> B = [[3, 4, 5],
...      [6, 7, 8]]
>>> C = [[7]]
>>> P = np.zeros((2, 0), dtype='int32')
>>> block_diag(A, B, C)
array([[1, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0],
       [0, 0, 3, 4, 5, 0],
       [0, 0, 6, 7, 8, 0],
       [0, 0, 0, 0, 0, 7]])
>>> block_diag(A, P, B, C)
array([[1, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 3, 4, 5, 0],
       [0, 0, 6, 7, 8, 0],
       [0, 0, 0, 0, 0, 7]])
>>> block_diag(1.0, [2, 3], [[4, 5], [6, 7]])
array([[ 1.,  0.,  0.,  0.,  0.],
       [ 0.,  2.,  3.,  0.,  0.],
       [ 0.,  0.,  0.,  4.,  5.],
       [ 0.,  0.,  0.,  6.,  7.]])
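
To make the motivation from section 7.1 concrete, the following sketch compares a per-object loop with a single product against the block-diagonal matrix. The matrices and state vectors are hypothetical values, and the names mats, states, looped and batched are not from the original code.

```python
import numpy as np
from scipy.linalg import block_diag

# Hypothetical per-object transition matrices and states (illustrative values)
mats = [np.array([[1.0, 0.1], [0.0, 1.0]]),
        np.array([[0.5, 0.0], [0.0, 0.5]])]
states = [np.array([1.0, 2.0]), np.array([4.0, 6.0])]

# Loop version: one matrix-vector product per object
looped = np.concatenate([m @ s for m, s in zip(mats, states)])

# Block-diagonal version: build the big matrix once, then a single product
big = block_diag(*mats)
stacked = np.concatenate(states)
batched = big @ stacked

print(np.allclose(looped, batched))  # True
```

The big matrix only needs to be built once, for example in the constructor, as long as the individual transition matrices do not change between steps.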

7.2 numpy.ndarray.flatten

If the environment's observation has to be a one-dimensional vector but step() uses a multi-dimensional array for readability, obs.flatten() does the conversion.


>>> import numpy as np
>>> a = np.array([[1,2], [3,4]])
>>> a.flatten()
array([1, 2, 3, 4])
>>> a.flatten('F')
array([1, 3, 2, 4])

https://numpy.org/doc/stable/reference/generated/numpy.ndarray.flatten.html
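
One detail worth knowing when returning observations: flatten() always returns a copy, while reshape(-1) returns a view when the memory layout allows it. The following is a short sketch with an illustrative array named obs.

```python
import numpy as np

obs = np.array([[1, 2], [3, 4]])

flat = obs.flatten()    # always a copy
flat[0] = 99
print(obs[0, 0])        # 1: the original is untouched

view = obs.reshape(-1)  # a view here, because obs is contiguous
view[0] = 99
print(obs[0, 0])        # 99: the write went through to obs
```

Returning the copy from step() is the safer choice if the environment keeps modifying its internal array afterwards.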

8. Short forms of if/else (conditional expressions)


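a, b, c = 1, 2, 3

# Conditional expression (recommended form)
c = a if a > b else b

# Equivalent to the full if/else
if a > b:
    c = a
else:
    c = b

# Older idioms that give the same result here
c = (a > b and a or b)  # relies on truthiness: any non-zero number counts as true
c = [b, a][a > b]       # indexes with a bool: True is 1, False is 0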
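
The and/or form is not fully equivalent to the conditional expression: it gives the wrong answer when the first value is falsy. The following is a small sketch with values picked to trigger that case; the names best, legacy and indexed are illustrative.

```python
a, b = 0, -1

best = a if a > b else b     # conditional expression
print(best)                  # 0

legacy = (a > b and a or b)  # a > b is True, but a is 0 (falsy), so "or b" wins
print(legacy)                # -1, not the larger value

indexed = [b, a][a > b]      # True indexes position 1
print(indexed)               # 0
```

This is one more reason to prefer the first form.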
