Where did my notes go... mine...

Notes on image processing and programming.

A roundup of work on Adversarial Training

For brevity, the following abbreviations are used.

  • AE: Adversarial Examples
  • AA: Adversarial Attack
  • clean: natural images that have not been subjected to an AA
  • AT: Adversarial Training
  • AR: Adversarial Robustness
  • BN: Batch Normalization

EXPLAINING AND HARNESSING ADVERSARIAL EXAMPLES [Goodfellow+, ICLR15]

  • First work to propose AT that trains on a mixture of clean images and AEs (a minimal sketch follows below).
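
As a concrete reference, here is a minimal PyTorch sketch of this kind of mixed clean/adversarial training. The model, data loader, and hyperparameters (eps, the 50/50 mixing weight) are placeholders of my own, not values from the paper.

import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    # FGSM: x_adv = x + eps * sign(grad_x loss); inputs assumed to lie in [0, 1]
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()

def train_epoch_mixed(model, loader, optimizer, eps=8 / 255, mix=0.5):
    # One epoch of AT on a mixture of clean and FGSM examples.
    model.train()
    for x, y in loader:
        x_adv = fgsm(model, x, y, eps)
        loss = mix * F.cross_entropy(model(x), y) \
             + (1 - mix) * F.cross_entropy(model(x_adv), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()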

Improving back-propagation by adding an adversarial gradient [Nøkland, arXiv15]

Learning with a strong adversary [Huang+, arXiv15]

Adversarial machine learning at scale [Kurakin+, ICLR17]

  • label leaking: the phenomenon where an AT-trained model performs better on AEs than on clean images. It is not observed when using attacks such as PGD.
  • Given label leaking, the following two strategies are recommended for AT:
    • When using one-step methods such as FGSM, use the label predicted by the model rather than the true label (sketched after this list).
    • Use multi-step methods such as PGD.
  • Increasing model capacity improves AR.
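
A sketch of the first strategy: the FGSM perturbation is computed against the label the model itself predicts, which is what avoids label leaking. The eps value is illustrative.

import torch
import torch.nn.functional as F

def fgsm_predicted_label(model, x, eps=8 / 255):
    # Use the model's own prediction as the target of the loss, not the true label.
    with torch.no_grad():
        y_pred = model(x).argmax(dim=1)
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y_pred)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()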

Understanding Adversarial Training: Increasing Local Stability of Neural Nets through Robust Optimization [Shaham+, Neurocomputing18]

Ensemble Adversarial Training: Attacks and Defenses [Tramèr+, ICLR18]

  • AT that uses AEs generated by multiple other models.

Towards Deep Learning Models Resistant to Adversarial Attacks [Madry+, ICLR18]

  • Proposes AT that trains only on AEs, without any clean images (a PGD-based sketch follows below).
  • AR requires far more capacity than is needed to classify clean images.
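
A minimal sketch of PGD-based AT in this spirit, training on AEs only. The step size, step count, and eps are common illustrative values, not necessarily those of the paper.

import torch
import torch.nn.functional as F

def pgd(model, x, y, eps=8 / 255, step=2 / 255, iters=10):
    # L-inf PGD: random start, then iterated signed-gradient steps projected back
    # into the eps-ball around x (inputs assumed to lie in [0, 1]).
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(iters):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + step * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def train_epoch_pgd(model, loader, optimizer):
    # One epoch of Madry-style AT: the loss is computed on PGD examples only.
    model.train()
    for x, y in loader:
        loss = F.cross_entropy(model(pgd(model, x, y)), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()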

Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples [Athalye+, ICML18]

  • Many defenses rely heavily on obfuscated gradients, and these can be broken by an attack called BPDA.
  • Backward Pass Differentiable Approximation (BPDA): replace a layer whose gradient cannot be computed with another layer whose gradient is (approximately) 1; for models involving randomness, estimate the gradient as an expectation (a minimal sketch follows at the end of this entry).
  • Defenses that rely on obfuscated gradients:
    • Thermometer encoding: One hot way to resist adversarial examples [Buckman+, ICLR18]
    • Characterizing adversarial subspaces using local intrinsic dimensionality [Ma+, ICLR18]
    • Countering adversarial images using input transformations [Guo+, ICLR18]
    • Stochastic activation pruning for robust adversarial defense [Dhillon+, ICLR18]
    • Mitigating adversarial effects through randomization [Xie+, ICLR18]
    • Pixeldefend: Leveraging generative models to understand and defend against adversarial examples [Song+, ICLR18]
    • Defensegan: Protecting classifiers against adversarial attacks using generative models [Samangouei+, ICLR18]
  • Defenses that actually hold up:
    • Towards deep learning models resistant to adversarial attacks [Madry+, ICLR18]
    • Cascade adversarial machine learning regularized with a unified embedding [Na+, ICLR18]
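
A minimal sketch of the identity special case of BPDA mentioned above: the forward pass applies some non-differentiable preprocessing (the callable preprocess below is a placeholder), while the backward pass pretends the layer was the identity and passes the gradient straight through.

import torch

class BPDAIdentity(torch.autograd.Function):
    # Forward: run the non-differentiable defense / preprocessing as-is.
    # Backward: treat it as the identity, i.e. a gradient of 1 everywhere.

    @staticmethod
    def forward(ctx, x, preprocess):
        return preprocess(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None  # gradient w.r.t. x; no gradient for preprocess

# Inside an attack one would use, e.g.: logits = model(BPDAIdentity.apply(x_adv, quantize))
# where quantize stands in for the non-differentiable defense being circumvented.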

Adversarial logit pairing [Kannan+, arXiv18]

  • AR improves with the following two techniques (a loss sketch follows below):
    • adversarial logit pairing: train the logits of a clean image and its AE to be similar
    • clean logit pairing: train the logits of pairs of clean images (not even of the same class) to be similar
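
A sketch of the adversarial logit pairing loss. The pairing weight is illustrative, and generate_ae stands in for whatever attack is used to produce the AE batch.

import torch
import torch.nn.functional as F

def alp_loss(model, x_clean, x_adv, y, pair_weight=0.5):
    # Standard adversarial cross-entropy plus an L2 term that pulls the logits
    # of each clean image and its AE towards each other.
    logits_clean = model(x_clean)
    logits_adv = model(x_adv)
    ce = F.cross_entropy(logits_adv, y)
    pairing = F.mse_loss(logits_adv, logits_clean)
    return ce + pair_weight * pairing

# usage: loss = alp_loss(model, x, generate_ae(model, x, y), y)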

Adversarial Training and Robustness for Multiple Perturbations [Tramèr+, NeurIPS19]

  • Shows that a model AT-trained against multiple types of attack cannot match the robustness of models AT-trained individually against each attack.

Adversarial Training for Free! [Shafahi+, NeurIPS19]

  • Removes the cost of generating AEs by reusing the gradient information already computed when updating the model parameters (sketched below).
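
A rough sketch of the "free" scheme: each minibatch is replayed several times, and a single backward pass per replay provides both the weight gradient (used by the optimizer) and the input gradient (used to update the perturbation, which is carried over between minibatches). The eps and replay count are illustrative.

import torch
import torch.nn.functional as F

def train_epoch_free(model, loader, optimizer, eps=8 / 255, replays=4):
    model.train()
    delta = None  # adversarial perturbation, reused across minibatches
    for x, y in loader:
        if delta is None or delta.shape != x.shape:
            delta = torch.zeros_like(x)
        for _ in range(replays):
            delta = delta.detach().requires_grad_(True)
            loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
            optimizer.zero_grad()
            loss.backward()           # one backward pass ...
            optimizer.step()          # ... drives the weight update ...
            with torch.no_grad():     # ... and the perturbation update
                delta = (delta + eps * delta.grad.sign()).clamp(-eps, eps)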

You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle [Zhang+, NeurIPS19]

  • Reduces the time AT takes by using only the input and output of the network's first layer when updating the adversarial perturbation.

Towards the first adversarially robust neural network model on MNIST [Schott+, ICLR19]

  • AT is not robust against attacks other than the one it was trained with.
  • Such models classify OOD inputs into some class with high confidence.

Interpreting Adversarially Trained Convolutional Neural Networks [Zhang+, ICML19]

  • AT-trained models mitigate texture bias and rely more on shape.
  • Experiments also cover transformations such as patch shuffle.

On the Connection Between Adversarial Robustness and Saliency Map Interpretability [Etmann+, ICML19]

  • Models with high AR produce more interpretable saliency maps.
  • Using models trained with Lipschitz regularization, they show that the nonlinearity of NNs weakens this relationship.
  • The larger the distance to the decision boundary, the stronger the alignment.

On the Convergence and Robustness of Adversarial Training [Wang+, ICML19]

  • In the early stages of AT there is no need to use AEs of high convergence quality (roughly, attack strength); doing so may even lower AR.
  • In the later stages of AT, AEs of high convergence quality are needed.
  • They propose an AT scheme that gradually increases the convergence quality of the AEs it uses.
  • This suggests that including very hard AEs in the early stages can hinder the DNN's feature learning.

Fast is better than free: Revisiting adversarial training [Wong+, ICLR20]

  • Adding random initialization to FGSM AT yields roughly the same effect as PGD AT, which cuts the time AT takes (a sketch follows after this list).
  • FGSM AT can be accelerated dramatically with cyclic learning rates, mixed-precision training, and the like.
  • FGSM AT failing where PGD AT succeeds is due to catastrophic overfitting (a phenomenon where robustness to PGD on the training data suddenly drops to 0%); it can be avoided with early stopping and similar measures.
  • There are several causes of FGSM AT failure:
    • zero initialization
    • a step size that is too large
    • particular learning-rate schedules or epoch counts
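
A sketch of FGSM AT with a random start in this spirit. The eps and step size are illustrative (the paper uses a step size somewhat larger than eps for the single FGSM step).

import torch
import torch.nn.functional as F

def fgsm_random_start(model, x, y, eps=8 / 255, step=10 / 255):
    # Start from a uniform random point inside the eps-ball, then take one FGSM step.
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
    grad = torch.autograd.grad(loss, delta)[0]
    delta = (delta + step * grad.sign()).clamp(-eps, eps)
    return (x + delta).clamp(0, 1).detach()

def train_epoch_fast(model, loader, optimizer):
    # One epoch of fast AT on FGSM-RS examples only.
    model.train()
    for x, y in loader:
        loss = F.cross_entropy(model(fgsm_random_start(model, x, y)), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()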

Improving adversarial robustness requires revisiting misclassified examples [Wang+, ICLR20]

INTRIGUING PROPERTIES OF ADVERSARIAL TRAINING AT SCALE [Xie+, ICLR20]

  • BN can get in the way of training during AT. For a model AT-trained with BN, simply removing the clean images from the training data improves AR by a large margin (18.3% in the paper).
  • Applying separate BNs to clean images and AEs achieves stronger AR (sketched after this list).
  • Making the network ever deeper does not change accuracy on natural images, but under AT, the deeper the network the higher the AR.
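
A sketch of the separate-BN idea: a small module that holds two independent BN layers and picks one according to whether the current batch is clean or adversarial. The module and flag names are my own, not from the paper.

import torch.nn as nn

class DualBatchNorm2d(nn.Module):
    # One BN branch sees only clean batches, the other only AE batches,
    # so the two input distributions get separate normalization statistics.

    def __init__(self, num_features):
        super().__init__()
        self.bn_clean = nn.BatchNorm2d(num_features)
        self.bn_adv = nn.BatchNorm2d(num_features)

    def forward(self, x, adversarial=False):
        return self.bn_adv(x) if adversarial else self.bn_clean(x)

# usage: out = dual_bn(features, adversarial=True) for batches that contain AEs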

Max-margin adversarial (mma) training: Direct input space margin maximization through adversarial training [Ding+, ICLR20]

  • Maximizes the network's margin in input space.

Concise Explanations of Neural Networks using Adversarial Training [Chalasani, ICML20]

Smooth Adversarial Training [Xie+, arXiv20]

  • The non-smoothness of ReLU substantially weakens AT.
  • Smooth Adversarial Training (SAT): replace ReLU with a smooth approximation, e.g. a parametric softplus (a sketch follows after this list).
  • Smooth activation functions help find harder adversarial examples.
  • Multi-step attacks may mitigate the ReLU problem.
  • In the end, the best AR was obtained with SiLU(x) = x · sigmoid(x).
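
A sketch of the activation swap behind SAT, here with SiLU; the recursive replacement helper is my own convenience function, not something from the paper.

import torch.nn as nn

def replace_relu_with_silu(module):
    # Walk the module tree and swap every nn.ReLU for nn.SiLU (x * sigmoid(x)).
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, nn.SiLU())
        else:
            replace_relu_with_silu(child)
    return module

# usage: model = replace_relu_with_silu(model), then run AT as usual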

Identifying Layers Susceptible to Adversarial Attacks [Siddiqui+, arXiv21]

  • They split the model into several parts and retrain each. The layers most susceptible to AEs turn out to be the early layers that form the feature extractor, so AT on only the later layers is not enough; conversely, AT on only the early layers is necessary and sufficient.
  • The features of AEs generated against a normally trained model are fundamentally different from those of clean images.
  • AT changes the weights of the early layers so that the features extracted from AEs follow the same distribution as those extracted from clean images.
  • AEs generated against an AT-trained model mimic clean images in the early layers.

Boosting Fast Adversarial Training with Learnable Adversarial Initialization [Jia+, arXiv21]

  • Improves Fast AT by making the adversarial initialization learnable.

Fixing Data Augmentation to Improve Adversarial Robustness [Rebuffi+, arXiv21]

  • Combining model weight averaging with data augmentation (Cutout, CutMix, MixUp) improves AR by a large margin; CutMix is the most effective (a weight-averaging sketch follows below).
  • Using state-of-the-art generative models improves it further.
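
A sketch of the weight-averaging part, kept as a simple exponential moving average of the parameters alongside the trained model. The decay value is illustrative, and BN running statistics are not averaged here for brevity.

import copy
import torch

class EMAModel:
    # Maintains an exponential moving average of a model's parameters; the averaged
    # copy is the one that gets evaluated for robustness.

    def __init__(self, model, decay=0.999):
        self.ema = copy.deepcopy(model).eval()
        self.decay = decay
        for p in self.ema.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model):
        for p_ema, p in zip(self.ema.parameters(), model.parameters()):
            p_ema.mul_(self.decay).add_(p, alpha=1 - self.decay)

# usage: call ema.update(model) after every optimizer.step(); evaluate ema.ema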

The JupyterLab terminal misbehaves on Docker

Normally, the JupyterLab terminal looks something like this:

username@hostname:~/$ 

but for some reason, when I open a terminal in JupyterLab running on Docker, it looks like the following, even though the configuration files are exactly the same:

# 

This makes it hard to tell where you are, and it is inconvenient in other ways too: for example, the arrow keys do not recall previous commands.

The fix is to change the following line in .jupyter/jupyter_notebook_config.py

c.NotebookApp.terminado_settings = {}

to this:

c.NotebookApp.terminado_settings = {'shell_command': ['/bin/bash']}

Connecting GAS and Slack in as few steps as possible

Goal

Send a message to Slack from GAS, and have a message sent automatically every day.

Sending a message to Slack from GAS

  1. Go to https://api.slack.com/
  2. Enter an App Name and a Workspace
  3. Select "Add features and functionality"
  4. Turn on Incoming Webhooks
  5. Under "Install your app", choose the channel the messages will be sent to
  6. Select Incoming Webhooks in the left-hand tab
  7. Copy the Webhook URL
  8. Enter the following in GAS, using the Webhook URL from above as url
  9. Run it
// Wrapped in a function so it can be selected and run from the GAS editor;
// the function name itself is arbitrary.
function postToSlack() {
  let text = "test";
  let url = "https://...";  // the Incoming Webhook URL copied in step 7

  let options = {
    "method": "post",
    "contentType": "application/json",
    "payload": JSON.stringify({
      "text": text,
      "link_names": 1  // lets Slack resolve @channel / @user mentions in the text
    })
  };
  UrlFetchApp.fetch(url, options);
}

Sending a message automatically every day

  1. Select "Triggers" in the left-hand tab of GAS
  2. Click "Add Trigger"
  3. "Select event source" -> "Time-driven"
  4. "Select type of time based trigger" -> "Day timer"
  5. Pick whatever time you like

Links to adversarial-examples-related papers at IEEE Symposium on Security & Privacy 2016-2020

I went through these by eye, so there may be mistakes or omissions.

20

Humpty Dumpty: Controlling Word Meanings via Corpus Poisoning*

[1911.02142] Intriguing Properties of Adversarial ML Attacks in the Problem Space

19

Stealthy Porn: Understanding Real-World Adversarial Images for Illicit Online Promotion - IEEE Conference Publication

On the Feasibility of Rerouting-Based DDoS Defenses - IEEE Conference Publication

[1802.03471] Certified Robustness to Adversarial Examples with Differential Privacy

[1902.01350] F-BLEAU: Fast Black-box Leakage Estimation

18

[1804.00308] Manipulating Machine Learning: Poisoning Attacks and Countermeasures for Regression Learning

Improved Reconstruction Attacks on Encrypted Data Using Range Query Leakage - IEEE Conference Publication

AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation - IEEE Conference Publication

17

[1608.04644] Towards Evaluating the Robustness of Neural Networks

16

[1511.04508] Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks

Links to adversarial-examples-related papers at AAAI 2016-2020

I went through these by eye, so there may be mistakes or omissions.

20

A New Ensemble Adversarial Attack Powered by Long-Term Gradient Memories | Proceedings of the AAAI Conference on Artificial Intelligence

ECGadv: Generating Adversarial Electrocardiogram to Misguide Arrhythmia Classification System | Proceedings of the AAAI Conference on Artificial Intelligence

A Frank-Wolfe Framework for Efficient and Effective Adversarial Attacks | Proceedings of the AAAI Conference on Artificial Intelligence

Optimal Attack against Autoregressive Models by Manipulating the Environment | Proceedings of the AAAI Conference on Artificial Intelligence

Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples | Proceedings of the AAAI Conference on Artificial Intelligence

Suspicion-Free Adversarial Attacks on Clustering Algorithms | Proceedings of the AAAI Conference on Artificial Intelligence

Improving the Robustness of Wasserstein Embedding by Adversarial PAC-Bayesian Learning | Proceedings of the AAAI Conference on Artificial Intelligence

Adversarially Robust Distillation | Proceedings of the AAAI Conference on Artificial Intelligence

Robust Stochastic Bandit Algorithms under Probabilistic Unbounded Adversarial Attack | Proceedings of the AAAI Conference on Artificial Intelligence

Robust Federated Learning via Collaborative Machine Teaching | Proceedings of the AAAI Conference on Artificial Intelligence

Spatiotemporally Constrained Action Space Attacks on Deep Reinforcement Learning Agents | Proceedings of the AAAI Conference on Artificial Intelligence

Robustness Certificates for Sparse Adversarial Attacks by Randomized Ablation | Proceedings of the AAAI Conference on Artificial Intelligence

Weighted-Sampling Audio Adversarial Example Attack | Proceedings of the AAAI Conference on Artificial Intelligence

Stealthy and Efficient Adversarial Attacks against Deep Reinforcement Learning | Proceedings of the AAAI Conference on Artificial Intelligence

Universal Adversarial Training | Proceedings of the AAAI Conference on Artificial Intelligence

CAG: A Real-Time Low-Cost Enhanced-Robustness High-Transferability Content-Aware Adversarial Attack Generator | Proceedings of the AAAI Conference on Artificial Intelligence

Adversarial Transformations for Semi-Supervised Learning | Proceedings of the AAAI Conference on Artificial Intelligence

Towards Certificated Model Robustness Against Weight Perturbations | Proceedings of the AAAI Conference on Artificial Intelligence

ML-LOO: Detecting Adversarial Examples with Feature Attribution | Proceedings of the AAAI Conference on Artificial Intelligence

CD-UAP: Class Discriminative Universal Adversarial Perturbation | Proceedings of the AAAI Conference on Artificial Intelligence

Towards Query-Efficient Black-Box Adversary with Zeroth-Order Natural Gradient Descent | Proceedings of the AAAI Conference on Artificial Intelligence

19

Resisting Adversarial Attacks Using Gaussian Mixture Variational Autoencoders | Proceedings of the AAAI Conference on Artificial Intelligence

AutoZOOM: Autoencoder-Based Zeroth Order Optimization Method for Attacking Black-Box Neural Networks | Proceedings of the AAAI Conference on Artificial Intelligence

Connecting the Digital and Physical World: Improving the Robustness of Adversarial Attacks | Proceedings of the AAAI Conference on Artificial Intelligence

Distributionally Adversarial Attack | Proceedings of the AAAI Conference on Artificial Intelligence

Knowledge Distillation with Adversarial Samples Supporting Decision Boundary | Proceedings of the AAAI Conference on Artificial Intelligence

The Curse of Concentration in Robust Learning: Evasion and Poisoning Attacks from Concentration of Measure | Proceedings of the AAAI Conference on Artificial Intelligence

Adversarial Dropout for Recurrent Neural Networks | Proceedings of the AAAI Conference on Artificial Intelligence

The Adversarial Attack and Detection under the Fisher Information Metric | Proceedings of the AAAI Conference on Artificial Intelligence

Sparse Adversarial Perturbations for Videos | Proceedings of the AAAI Conference on Artificial Intelligence

18

17

[1705.08378] Detecting Adversarial Image Examples in Deep Networks with Adaptive Noise Reduction

[1803.00401] Unravelling Robustness of Deep Learning based Face Recognition Against Adversarial Attacks

[1709.04114] EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples

[1801.04693] Towards Imperceptible and Robust Adversarial Example Attacks against Neural Networks

[1711.09404] Improving the Adversarial Robustness and Interpretability of Deep Neural Networks by Regularizing their Input Gradients

Learning to Attack: Adversarial Transformation Networks

16

(PDF) Multi-Defender Strategic Filtering Against Spear-Phishing Attacks

[PDF] Data Poisoning Attacks against Autoregressive Models | Semantic Scholar

Links to adversarial-examples-related papers at ICML 2018-2020

I went through these by eye, so there may be mistakes or omissions.

20

[1909.13806] Min-Max Optimization without Gradients: Convergence and Applications to Adversarial ML

[2004.13617] Adversarial Learning Guarantees for Linear Hypotheses and Neural Networks

[2003.03778] Adversarial Attacks on Probabilistic Autoregressive Forecasting Models

[2002.11569] Overfitting in adversarially robust deep learning

[2002.11798] Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization

[2002.04694] Adversarial Robustness for Code

[2008.02883] Stronger and Faster Wasserstein Adversarial Attacks

Attacks Which Do Not Kill Training Make Adversarial Learning Stronger

Adversarial Risk via Optimal Transport and Optimal Couplings

[2002.04599] Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations

[2006.14748] Proper Network Interpretability Helps Adversarial Robustness in Classification

[2006.16520] Black-box Certification and Learning under Adversarial Perturbations

[1909.04068] Adversarial Robustness Against the Union of Multiple Perturbation Models

[2007.11826] Hierarchical Verification for Adversarial Robustness

[2003.01690] Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks

[2006.16384] Sharp Statistical Guarantees for Adversarially Robust Gaussian Classification

[1906.07153] Adversarial attacks on Copyright Detection Systems

[2011.07478] Towards Understanding the Regularization of Adversarial Robustness on Neural Networks

[1810.06583] Concise Explanations of Neural Networks using Adversarial Training

[2002.11821] Improving Robustness of Deep-Learning-Based Image Reconstruction

[2002.04725] More Data Can Expand the Generalization Gap Between Adversarially Robust and Standard Models

[2003.10602] Defense Through Diverse Directions

[2002.11565] Randomization matters. How to defend against strong adversarial attacks

[1907.02044] Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack

19

First-Order Adversarial Vulnerability of Neural Networks and Input Dimension

[1809.01093] Adversarial Attacks on Node Embeddings via Graph Poisoning

[1907.13220] Multi-Agent Adversarial Inverse Reinforcement Learning

[1901.08846] Improving Adversarial Robustness via Promoting Ensemble Diversity

[1903.06603] On Certifying Non-uniform Bound against Adversarial Attacks

[1904.00759] Adversarial camera stickers: A physical camera-based attack on deep learning systems

[1805.10204] Adversarial examples from computational constraints

[1905.07387] POPQORN: Quantifying Robustness of Recurrent Neural Networks

[1810.04065] Generalized No Free Lunch Theorem for Adversarial Robustness

[1902.10660] Robust Decision Trees Against Adversarial Examples

[1802.06552] Are Generative Classifiers More Robust to Adversarial Attacks?

[1902.02918] Certified Adversarial Robustness via Randomized Smoothing

[1903.10346] Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition

[1905.06635] Parsimonious Black-Box Adversarial Attacks via Efficient Combinatorial Optimization

[1902.07906] Wasserstein Adversarial Examples via Projected Sinkhorn Iterations

[1905.05897] Transferable Clean-Label Poisoning Attacks on Deep Neural Nets

[1905.00441] NATTACK: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks

[1905.07121] Simple Black-box Adversarial Attacks

[1905.06494] Data Poisoning Attacks on Stochastic Bandits

Data Poisoning Attacks in Multi-Party Learning

Transferable Adversarial Training: A General Approach to Adapting Deep Classifiers

[1901.10513] Adversarial Examples Are a Natural Consequence of Test Error in Noise

[1905.09797] Interpreting Adversarially Trained Convolutional Neural Networks

[1806.02977] Monge blunts Bayes: Hardness Results for Adversarial Training

[1811.00007] Robustly Disentangled Causal Mechanisms: Validating Deep Representations for Interventional Robustness

[1905.04172] On the Connection Between Adversarial Robustness and Saliency Map Interpretability

[1906.11897] On Physical Adversarial Patches for Object Detection

18

Provable Defenses against Adversarial Examples via the Convex Outer Adversarial Polytope

Synthesizing Robust Adversarial Examples

Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples

Selecting Representative Examples for Program Synthesis

Adversarial Risk and the Dangers of Evaluating Against Weak Attacks

Black-box Adversarial Attacks with Limited Queries and Information

Analyzing the Robustness of Nearest Neighbors to Adversarial Examples