Glossary of Common Deep Learning Terms
Chen Long edited this page Mar 30, 2021 · 1 revision
| Chinese | English | Abbreviation |
|---|---|---|
| 深度学习 | deep learning | |
| 机器学习 | machine learning | |
| 机器学习模型 | machine learning model | |
| 逻辑回归 | logistic regression | |
| 回归 | regression | |
| 人工智能 | artificial intelligence | |
| 朴素贝叶斯 | naive Bayes | |
| 表示 | representation | |
| 表示学习 | representation learning | |
| 自编码器 | autoencoder | |
| 编码器 | encoder | |
| 解码器 | decoder | |
| 多层感知机 | multilayer perceptron | |
| 人工神经网络 | artificial neural network | |
| 神经网络 | neural network | |
| 随机梯度下降 | stochastic gradient descent | SGD |
| 线性模型 | linear model | |
| 线性回归 | linear regression | |
| 整流线性单元 | rectified linear unit | ReLU |
| 分布式表示 | distributed representation | |
| 非分布式表示 | nondistributed representation | |
| 非分布式 | nondistributed | |
| 隐藏单元 | hidden unit | |
| 长短期记忆 | long short-term memory | LSTM |
| 深度信念网络 | deep belief network | DBN |
| 循环神经网络 | recurrent neural network | RNN |
| 循环 | recurrence | |
| 强化学习 | reinforcement learning | |
| 推断 | inference | |
| 上溢 | overflow | |
| 下溢 | underflow | |
| softmax函数 | softmax function | |
| softmax | softmax | |
| 欠估计 | underestimation | |
| 过估计 | overestimation | |
| 病态条件 | poor conditioning | |
| 目标函数 | objective function | |
| 目标 | objective | |
| 准则 | criterion | |
| 代价函数 | cost function | |
| 代价 | cost | |
| 损失函数 | loss function | |
| PR曲线 | PR curve | |
| F值 | F-score | |
| 损失 | loss | |
| 误差函数 | error function | |
| 梯度下降 | gradient descent | |
| 导数 | derivative | |
| 临界点 | critical point | |
| 驻点 | stationary point | |
| 局部极小点 | local minimum | |
| 极小点 | minimum | |
| 局部极小值 | local minima | |
| 极小值 | minima | |
| 全局极小值 | global minima | |
| 局部极大值 | local maxima | |
| 极大值 | maxima | |
| 局部极大点 | local maximum | |
| 鞍点 | saddle point | |
| 全局最小点 | global minimum | |
| 偏导数 | partial derivative | |
| 梯度 | gradient | |
| 样本 | example | |
| 二阶导数 | second derivative | |
| 曲率 | curvature | |
| 凸优化 | Convex optimization | |
| 非凸 | nonconvex | |
| 数值优化 | numerical optimization | |
| 约束优化 | constrained optimization | |
| 可行 | feasible | |
| 等式约束 | equality constraint | |
| 不等式约束 | inequality constraint | |
| 正则化 | regularization | |
| 正则化项 | regularizer | |
| 正则化 | regularize | |
| 泛化 | generalization | |
| 泛化 | generalize | |
| 欠拟合 | underfitting | |
| 过拟合 | overfitting | |
| 偏差 | bias | |
| 方差 | variance | |
| 集成 | ensemble | |
| 估计 | estimator | |
| 权重衰减 | weight decay | |
| 协方差 | covariance | |
| 稀疏 | sparse | |
| 特征选择 | feature selection | |
| 特征提取器 | feature extractor | |
| 最大后验 | Maximum A Posteriori | MAP |
| 池化 | pooling | |
| Dropout | Dropout | |
| 蒙特卡罗 | Monte Carlo | |
| 提前终止 | early stopping | |
| 卷积神经网络 | convolutional neural network | CNN |
| 小批量 | minibatch | |
| 重要采样 | Importance Sampling | |
| 变分自编码器 | variational auto-encoder | VAE |
| 计算机视觉 | Computer Vision | CV |
| 语音识别 | Speech Recognition | |
| 自然语言处理 | Natural Language Processing | NLP |
| 有向模型 | Directed Model | |
| 原始采样 | Ancestral Sampling | |
| 随机矩阵 | Stochastic Matrix | |
| 平稳分布 | Stationary Distribution | |
| 均衡分布 | Equilibrium Distribution | |
| 索引 | index of matrix | |
| 磨合 | Burn-in | |
| 混合时间 | Mixing Time | |
| 混合 | Mixing | |
| Gibbs采样 | Gibbs Sampling | |
| 吉布斯步数 | Gibbs steps | |
| Bagging | bootstrap aggregating | |
| 掩码 | mask | |
| 批标准化 | batch normalization | |
| 参数共享 | parameter sharing | |
| KL散度 | KL divergence | |
| 温度 | temperature | |
| 临界温度 | critical temperatures | |
| 并行回火 | parallel tempering | |
| 自动语音识别 | Automatic Speech Recognition | ASR |
| 级联 | coalesced | |
| 数据并行 | data parallelism | |
| 模型并行 | model parallelism | |
| 异步随机梯度下降 | Asynchronous Stochastic Gradient Descent | |
| 参数服务器 | parameter server | |
| 模型压缩 | model compression | |
| 动态结构 | dynamic structure | |
| 隐马尔可夫模型 | Hidden Markov Model | HMM |
| 高斯混合模型 | Gaussian Mixture Model | GMM |
| 转录 | transcribe | |
| 主成分分析 | principal components analysis | PCA |
| 因子分析 | factor analysis | |
| 独立成分分析 | independent component analysis | ICA |
| 稀疏编码 | sparse coding | |
| 定点运算 | fixed-point arithmetic | |
| 浮点运算 | floating-point arithmetic | |
| 生成模型 | generative model | |
| 生成式建模 | generative modeling | |
| 数据集增强 | dataset augmentation | |
| 白化 | whitening | |
| 深度神经网络 | deep neural network | DNN |
| 端到端的 | end-to-end | |
| 图模型 | graphical model | |
| 有向图模型 | directed graphical model | |
| 依赖 | dependency | |
| 贝叶斯网络 | Bayesian network | |
| 模型平均 | model averaging | |
| 声明 | statement | |
| 量子力学 | quantum mechanics | |
| 亚原子 | subatomic | |
| 逼真度 | fidelity | |
| 信任度 | degree of belief | |
| 频率派概率 | frequentist probability | |
| 贝叶斯概率 | Bayesian probability | |
| 似然 | likelihood | |
| 随机变量 | random variable | |
| 概率分布 | probability distribution | |
| 联合概率分布 | joint probability distribution | |
| 归一化的 | normalized | |
| 均匀分布 | uniform distribution | |
| 概率密度函数 | probability density function | |
| 累积函数 | cumulative function | |
| 边缘概率分布 | marginal probability distribution | |
| 求和法则 | sum rule | |
| 条件概率 | conditional probability | |
| 干预查询 | intervention query | |
| 因果模型 | causal modeling | |
| 因果因子 | causal factor | |
| 链式法则 | chain rule | |
| 乘法法则 | product rule | |
| 相互独立的 | independent | |
| 条件独立的 | conditionally independent | |
| 期望 | expectation | |
| 期望值 | expected value | |
| 特征 | feature | |
| 准确率 | accuracy | |
| 错误率 | error rate | |
| 训练集 | training set | |
| 解释因子 | explanatory factor | |
| 潜在 | underlying | |
| 潜在成因 | underlying cause | |
| 测试集 | test set | |
| 性能度量 | performance measures | |
| 经验 | experience | |
| 无监督 | unsupervised | |
| 有监督 | supervised | |
| 半监督 | semi-supervised | |
| 监督学习 | supervised learning | |
| 无监督学习 | unsupervised learning | |
| 数据集 | dataset | |
| 数据点 | data point | |
| 标签 | label | |
| 标注 | labeled | |
| 未标注 | unlabeled | |
| 目标 | target | |
| 设计矩阵 | design matrix | |
| 参数 | parameter | |
| 权重 | weight | |
| 均方误差 | mean squared error | MSE |
| 正规方程 | normal equation | |
| 训练误差 | training error | |
| 泛化误差 | generalization error | |
| 测试误差 | test error | |
| 假设空间 | hypothesis space | |
| 容量 | capacity | |
| 表示容量 | representational capacity | |
| 有效容量 | effective capacity | |
| 线性阈值单元 | linear threshold units | |
| 非参数 | non-parametric | |
| 最近邻回归 | nearest neighbor regression | |
| 最近邻 | nearest neighbor | |
| 验证集 | validation set | |
| 基准 | benchmark | |
| 基准 | baseline | |
| 点估计 | point estimator | |
| 估计量 | estimator | |
| 统计量 | statistics | |
| 无偏 | unbiased | |
| 有偏 | biased | |
| 异步 | asynchronous | |
| 渐近无偏 | asymptotically unbiased | |
| 标准误差 | standard error | |
| 一致性 | consistency | |
| 统计效率 | statistical efficiency | |
| 有参情况 | parametric case | |
| 贝叶斯统计 | Bayesian statistics | |
| 先验概率分布 | prior probability distribution | |
| 最大后验 | maximum a posteriori | |
| 最大似然估计 | maximum likelihood estimation | |
| 最大似然 | maximum likelihood | |
| 核技巧 | kernel trick | |
| 核函数 | kernel function | |
| 高斯核 | Gaussian kernel | |
| 核机器 | kernel machine | |
| 核方法 | kernel method | |
| 支持向量 | support vector | |
| 支持向量机 | support vector machine | SVM |
| 音素 | phoneme | |
| 声学 | acoustic | |
| 语音 | phonetic | |
| 专家混合体 | mixture of experts | |
| 高斯混合体 | Gaussian mixtures | |
| 选通器 | gater | |
| 专家网络 | expert network | |
| 注意力机制 | attention mechanism | |
| 对抗样本 | adversarial example | |
| 对抗 | adversarial | |
| 对抗训练 | adversarial training | |
| 切面距离 | tangent distance | |
| 正切传播 | tangent prop | |
| 正切传播 | tangent propagation | |
| 双反向传播 | double backprop | |
| 期望最大化 | expectation maximization | EM |
| 均值场 | mean-field | |
| 变分推断 | variational inference | |
| 二值稀疏编码 | binary sparse coding | |
| 前馈网络 | feedforward network | |
| 转移 | transition | |
| 重构 | reconstruction | |
| 生成随机网络 | generative stochastic network | |
| 得分匹配 | score matching | |
| 因子 | factorial | |
| 分解的 | factorized | |
| 均匀场 | mean-field | |
| 概率PCA | probabilistic PCA | |
| 随机梯度上升 | Stochastic Gradient Ascent | |
| 团 | clique | |
| Dirac分布 | Dirac distribution | |
| 不动点方程 | fixed point equation | |
| 变分法 | calculus of variations | |
| 信念网络 | belief network | |
| 马尔可夫随机场 | Markov random field | |
| 马尔可夫网络 | Markov network | |
| 对数线性模型 | log-linear model | |
| 自由能 | free energy | |
| 局部条件概率分布 | local conditional probability distribution | |
| 条件概率分布 | conditional probability distribution | |
| 玻尔兹曼分布 | Boltzmann distribution | |
| 吉布斯分布 | Gibbs distribution | |
| 能量函数 | energy function | |
| 标准差 | standard deviation | |
| 相关系数 | correlation | |
| 标准正态分布 | standard normal distribution | |
| 协方差矩阵 | covariance matrix | |
| Bernoulli分布 | Bernoulli distribution | |
| Bernoulli输出分布 | Bernoulli output distribution | |
| Multinoulli分布 | multinoulli distribution | |
| Multinoulli输出分布 | multinoulli output distribution | |
| 范畴分布 | categorical distribution | |
| 多项式分布 | multinomial distribution | |
| 正态分布 | normal distribution | |
| 高斯分布 | Gaussian distribution | |
| 精度 | precision | |
| 多维正态分布 | multivariate normal distribution | |
| 精度矩阵 | precision matrix | |
| 各向同性 | isotropic | |
| 指数分布 | exponential distribution | |
| 指示函数 | indicator function | |
| 广义函数 | generalized function | |
| 经验分布 | empirical distribution | |
| 经验频率 | empirical frequency | |
| 混合分布 | mixture distribution | |
| 潜变量 | latent variable | |
| 隐藏变量 | hidden variable | |
| 先验概率 | prior probability | |
| 后验概率 | posterior probability | |
| 万能近似器 | universal approximator | |
| 饱和 | saturate | |
| 分对数 | logit | |
| 正部函数 | positive part function | |
| 负部函数 | negative part function | |
| 贝叶斯规则 | Bayes' rule | |
| 测度论 | measure theory | |
| 零测度 | measure zero | |
| Jacobian矩阵 | Jacobian matrix | |
| 自信息 | self-information | |
| 奈特 | nats | |
| 比特 | bit | |
| 香农 | shannons | |
| 香农熵 | Shannon entropy | |
| 微分熵 | differential entropy | |
| 微分方程 | differential equation | |
| KL散度 | Kullback-Leibler (KL) divergence | |
| 交叉熵 | cross-entropy | |
| 熵 | entropy | |
| 分解 | factorization | |
| 结构化概率模型 | structured probabilistic model | |
| 回退 | back-off | |
| 有向 | directed | |
| 无向 | undirected | |
| 无向图模型 | undirected graphical model | |
| 成比例 | proportional | |
| 描述 | description | |
| 决策树 | decision tree | |
| 因子图 | factor graph | |
| 结构学习 | structure learning | |
| 环状信念传播 | loopy belief propagation | |
| 卷积网络 | convolutional network | |
| 卷积网络 | convolutional net | |
| 主对角线 | main diagonal | |
| 转置 | transpose | |
| 广播 | broadcasting | |
| 矩阵乘积 | matrix product | |
| AdaGrad | AdaGrad | |
| 逐元素乘积 | element-wise product | |
| Hadamard乘积 | Hadamard product | |
| 团势能 | clique potential | |
| 因子 | factor | |
| 未归一化概率函数 | unnormalized probability function | |
| 循环网络 | recurrent network | |
| 梯度消失与爆炸问题 | vanishing and exploding gradient problem | |
| 梯度消失 | vanishing gradient | |
| 梯度爆炸 | exploding gradient | |
| 计算图 | computational graph | |
| 展开 | unfolding | |
| 求逆 | invert | |
| 时间步 | time step | |
| 维数灾难 | curse of dimensionality | |
| 平滑先验 | smoothness prior | |
| 局部不变性先验 | local constancy prior | |
| 局部核 | local kernel | |
| 流形 | manifold | |
| 流形正切分类器 | manifold tangent classifier | |
| 流形学习 | manifold learning | |
| 流形假设 | manifold hypothesis | |
| 环 | loop | |
| 弦 | chord | |
| 弦图 | chordal graph | |
| 三角形化图 | triangulated graph | |
| 三角形化 | triangulate | |
| 风险 | risk | |
| 经验风险 | empirical risk | |
| 经验风险最小化 | empirical risk minimization | |
| 代理损失函数 | surrogate loss function | |
| 批量 | batch | |
| 确定性 | deterministic | |
| 随机 | stochastic | |
| 在线 | online | |
| 流 | stream | |
| 梯度截断 | gradient clipping | |
| 幂方法 | power method | |
| 前向传播 | forward propagation | |
| 反向传播 | backward propagation | |
| 展开图 | unfolded graph | |
| 深度前馈网络 | deep feedforward network | |
| 前馈神经网络 | feedforward neural network | |
| 前向 | feedforward | |
| 反馈 | feedback | |
| 网络 | network | |
| 深度 | depth | |
| 输出层 | output layer | |
| 隐藏层 | hidden layer | |
| 宽度 | width | |
| 单元 | unit | |
| 激活函数 | activation function | |
| 反向传播 | back propagation | backprop |
| 泛函 | functional | |
| 平均绝对误差 | mean absolute error | |
| 赢者通吃 | winner-take-all | |
| 异方差 | heteroscedastic | |
| 混合密度网络 | mixture density network | |
| 梯度截断 | clip gradient | |
| 绝对值整流 | absolute value rectification | |
| 渗漏整流线性单元 | Leaky ReLU | |
| 参数化整流线性单元 | parametric ReLU | PReLU |
| maxout单元 | maxout unit | |
| 硬双曲正切函数 | hard tanh | |
| 架构 | architecture | |
| 操作 | operation | |
| 符号 | symbol | |
| 数值 | numeric value | |
| 动态规划 | dynamic programming | |
| 自动微分 | automatic differentiation | |
| 并行分布式处理 | Parallel Distributed Processing | |
| 稀疏激活 | sparse activation | |
| 衰减 | damping | |
| 学成 | learned | |
| 信息传输 | message passing | |
| 泛函导数 | functional derivative | |
| 变分导数 | variational derivative | |
| 额外误差 | excess error | |
| 动量 | momentum | |
| 混沌 | chaos | |
| 稀疏初始化 | sparse initialization | |
| 共轭方向 | conjugate directions | |
| 共轭 | conjugate | |
| 条件独立 | conditionally independent | |
| 集成学习 | ensemble learning | |
| 独立子空间分析 | independent subspace analysis | |
| 慢特征分析 | slow feature analysis | SFA |
| 慢性原则 | slowness principle | |
| 整流线性 | rectified linear | |
| 整流网络 | rectifier network | |
| 坐标下降 | coordinate descent | |
| 坐标上升 | coordinate ascent | |
| 预训练 | pretraining | |
| 无监督预训练 | unsupervised pretraining | |
| 逐层的 | layer-wise | |
| 贪心算法 | greedy algorithm | |
| 贪心 | greedy | |
| 精调 | fine-tuning | |
| 课程学习 | curriculum learning | |
| 召回率 | recall | |
| 覆盖 | coverage | |
| 超参数优化 | hyperparameter optimization | |
| 超参数 | hyperparameter | |
| 网格搜索 | grid search | |
| 有限差分 | finite difference | |
| 中心差分 | centered difference | |
| 储层计算 | reservoir computing | |
| 谱半径 | spectral radius | |
| 收缩 | contractive | |
| 长期依赖 | long-term dependency | |
| 跳跃连接 | skip connection | |
| 门控RNN | gated RNN | |
| 门控 | gated | |
| 卷积 | convolution | |
| 输入 | input | |
| 输入分布 | input distribution | |
| 输出 | output | |
| 特征映射 | feature map | |
| 翻转 | flip | |
| 稀疏交互 | sparse interactions | |
| 等变表示 | equivariant representations | |
| 稀疏连接 | sparse connectivity | |
| 稀疏权重 | sparse weights | |
| 接受域 | receptive field | |
| 绑定的权重 | tied weights | |
| 等变 | equivariance | |
| 探测级 | detector stage | |
| 符号表示 | symbolic representation | |
| 池化函数 | pooling function | |
| 最大池化 | max pooling | |
| 池 | pool | |
| 不变 | invariant | |
| 步幅 | stride | |
| 降采样 | downsampling | |
| 全 | full | |
| 非共享卷积 | unshared convolution | |
| 平铺卷积 | tiled convolution | |
| 循环卷积网络 | recurrent convolutional network | |
| 傅立叶变换 | Fourier transform | |
| 可分离的 | separable | |
| 初级视觉皮层 | primary visual cortex | |
| 简单细胞 | simple cell | |
| 复杂细胞 | complex cell | |
| 象限对 | quadrature pair | |
| 门控循环单元 | gated recurrent unit | GRU |
| 门控循环网络 | gated recurrent net | |
| 遗忘门 | forget gate | |
| 截断梯度 | clipping the gradient | |
| 记忆网络 | memory network | |
| 神经网络图灵机 | neural Turing machine | NTM |
| 精调 | fine-tune | |
| 共因 | common cause | |
| 编码 | code | |
| 再循环 | recirculation | |
| 欠完备 | undercomplete | |
| 完全图 | complete graph | |
| 欠定的 | underdetermined | |
| 过完备 | overcomplete | |
| 去噪 | denoising | |
| 去噪 | denoise | |
| 重构误差 | reconstruction error | |
| 梯度场 | gradient field | |
| 得分 | score | |
| 切平面 | tangent plane | |
| 最近邻图 | nearest neighbor graph | |
| 嵌入 | embedding | |
| 近似推断 | approximate inference | |
| 信息检索 | information retrieval | |
| 语义哈希 | semantic hashing | |
| 降维 | dimensionality reduction | |
| 对比散度 | contrastive divergence | |
| 语言模型 | language model | |
| 标记 | token | |
| 一元语法 | unigram | |
| 二元语法 | bigram | |
| 三元语法 | trigram | |
| 平滑 | smoothing | |
| 级联 | cascade | |
| 模型 | model | |
| 层 | layer | |
| 半监督学习 | semi-supervised learning | |
| 监督模型 | supervised model | |
| 词嵌入 | word embedding | |
| one-hot | one-hot | |
| 监督预训练 | supervised pretraining | |
| 迁移学习 | transfer learning | |
| 学习器 | learner | |
| 多任务学习 | multitask learning | |
| 领域自适应 | domain adaptation | |
| 一次学习 | one-shot learning | |
| 零次学习 | zero-shot learning | |
| 零数据学习 | zero-data learning | |
| 多模态学习 | multimodal learning | |
| 生成式对抗网络 | generative adversarial network | GAN |
| 前馈分类器 | feedforward classifier | |
| 线性分类器 | linear classifier | |
| 正相 | positive phase | |
| 负相 | negative phase | |
| 随机最大似然 | stochastic maximum likelihood | |
| 噪声对比估计 | noise-contrastive estimation | NCE |
| 噪声分布 | noise distribution | |
| 噪声 | noise | |
| 独立同分布 | independent and identically distributed | i.i.d. |
| 专用集成电路 | application-specific integrated circuit | ASIC |
| 现场可编程门阵列 | field-programmable gate array | FPGA |
| 标量 | scalar | |
| 向量 | vector | |
| 矩阵 | matrix | |
| 张量 | tensor | |
| 点积 | dot product | |
| 内积 | inner product | |
| 方阵 | square matrix | |
| 奇异的 | singular | |
| 范数 | norm | |
| 三角不等式 | triangle inequality | |
| 欧几里得范数 | Euclidean norm | |
| 最大范数 | max norm | |
| 对角矩阵 | diagonal matrix | |
| 对称 | symmetric | |
| 单位向量 | unit vector | |
| 单位范数 | unit norm | |
| 正交 | orthogonal | |
| 正交矩阵 | orthogonal matrix | |
| 标准正交 | orthonormal | |
| 特征分解 | eigendecomposition | |
| 特征向量 | eigenvector | |
| 特征值 | eigenvalue | |
| 分解 | decompose | |
| 正定 | positive definite | |
| 负定 | negative definite | |
| 半负定 | negative semidefinite | |
| 半正定 | positive semidefinite | |
| 奇异值分解 | singular value decomposition | SVD |
| 奇异值 | singular value | |
| 奇异向量 | singular vector | |
| 单位矩阵 | identity matrix | |
| 矩阵逆 | matrix inversion | |
| 原点 | origin | |
| 线性组合 | linear combination | |
| 列空间 | column space | |
| 值域 | range | |
| 线性相关 | linear dependency | |
| 线性无关 | linearly independent | |
| 列 | column | |
| 行 | row | |
| 同分布的 | identically distributed | |
| 机器翻译 | machine translation | |
| 推荐系统 | recommender system | |
| 词袋 | bag of words | |
| 协同过滤 | collaborative filtering | |
| 探索 | exploration | |
| 策略 | policy | |
| 关系 | relation | |
| 属性 | attribute | |
| 词义消歧 | word-sense disambiguation | |
| 误差度量 | error metric | |
| 性能度量 | performance metrics | |
| 共轭梯度 | conjugate gradient | |
| 在线学习 | online learning | |
| 逐层预训练 | layer-wise pretraining | |
| 自回归网络 | auto-regressive network | |
| 生成器网络 | generator network | |
| 判别器网络 | discriminator network | |
| 矩 | moment | |
| 可见层 | visible layer | |
| 无限 | infinite | |
| 容差 | tolerance | |
| 学习率 | learning rate | |
| 轮数 | epochs | |
| 轮 | epoch | |
| 对数尺度 | logarithmic scale | |
| 随机搜索 | random search | |
| 分段 | piecewise | |
| 汉明距离 | Hamming distance | |
| 可见变量 | visible variable | |
| 精确推断 | exact inference | |
| 潜层 | latent layer | |
| 知识图谱 | knowledge graph | |