DOI: 10.5555/3692070.3694038

MOKD: cross-domain finetuning for few-shot classification via maximizing optimized kernel dependence

Published: 03 January 2025

Abstract

In cross-domain few-shot classification, the nearest centroid classifier (NCC) aims to learn representations that form a metric space in which few-shot classification can be performed by measuring the similarities between samples and the prototype of each class. The intuition behind NCC is that each sample is pulled closer to the centroid of its own class and pushed away from the centroids of other classes. However, in this paper, we find that high similarities exist between the NCC-learned representations of samples from different classes. To address this problem, we propose a bi-level optimization framework, maximizing optimized kernel dependence (MOKD), which learns a set of class-specific representations that match the cluster structure indicated by the labeled data of a given task. Specifically, MOKD first optimizes the kernel adopted in the Hilbert-Schmidt independence criterion (HSIC) to obtain the optimized kernel HSIC (opt-HSIC), which captures dependence more precisely. Then, an optimization problem over the opt-HSIC is solved to simultaneously maximize the dependence between representations and labels and minimize the dependence among all samples. Extensive experiments on Meta-Dataset demonstrate that MOKD not only achieves better generalization performance on unseen domains in most cases but also learns better-clustered data representations. The project repository of MOKD is available at: https://github.com/tmlr-group/MOKD.
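The dependence measure at the heart of the abstract is HSIC. As a rough illustration only (not the authors' implementation), the biased empirical HSIC estimator and a MOKD-style objective can be sketched in NumPy; the Gaussian `bandwidth`, the `gamma` trade-off weight, and the linear kernel on one-hot labels are all illustrative assumptions:

```python
import numpy as np

def gaussian_kernel(X, bandwidth=1.0):
    """RBF kernel matrix over the rows of X; `bandwidth` is the free
    kernel parameter (the quantity MOKD's inner level optimizes)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))

def hsic(K, L):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2 with the
    centering matrix H = I - (1/n) 11^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def mokd_style_objective(Z, Y_onehot, gamma=1.0, bandwidth=1.0):
    """Sketch of the outer-level trade-off: maximize dependence between
    representations Z and labels while penalizing dependence among all
    samples; `gamma` weights the penalty (a hypothetical knob here)."""
    K = gaussian_kernel(Z, bandwidth)
    L = Y_onehot @ Y_onehot.T  # linear kernel on one-hot labels
    return hsic(K, L) - gamma * hsic(K, K)
```

Under this sketch, representations whose kernel matrix aligns with the label kernel score a higher objective, which matches the abstract's "maximize dependence between representations and labels, minimize dependence among all samples" description.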



Published In

ICML'24: Proceedings of the 41st International Conference on Machine Learning
July 2024, 63010 pages

Publisher

JMLR.org


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%
