
Commit ec6fe7e

DOC update README to forward to the documentation page
1 parent 3c2f10c


README.rst

Lines changed: 3 additions & 88 deletions
@@ -154,92 +154,7 @@ One way of addressing this issue is by re-sampling the dataset as to offset this
 imbalance with the hope of arriving at a more robust and fair decision boundary
 than you would otherwise.
 
-Re-sampling techniques are divided into the following four categories:
-1. Under-sampling the majority class(es).
-2. Over-sampling the minority class.
-3. Combining over- and under-sampling.
-4. Creating ensemble balanced sets.
-
-Below is a list of the methods currently implemented in this module.
-
-* Under-sampling
-    1. Random majority under-sampling with replacement
-    2. Extraction of majority-minority Tomek links [1]_
-    3. Under-sampling with Cluster Centroids
-    4. NearMiss-(1 & 2 & 3) [2]_
-    5. Condensed Nearest Neighbour [3]_
-    6. One-Sided Selection [4]_
-    7. Neighbourhood Cleaning Rule [5]_
-    8. Edited Nearest Neighbours [6]_
-    9. Instance Hardness Threshold [7]_
-    10. Repeated Edited Nearest Neighbours [14]_
-    11. AllKNN [14]_
-
-* Over-sampling
-    1. Random minority over-sampling with replacement
-    2. SMOTE - Synthetic Minority Over-sampling Technique [8]_
-    3. SMOTENC - SMOTE for Nominal and Continuous [8]_
-    4. SMOTEN - SMOTE for Nominal [8]_
-    5. bSMOTE(1 & 2) - Borderline SMOTE of types 1 and 2 [9]_
-    6. SVM SMOTE - Support Vectors SMOTE [10]_
-    7. ADASYN - Adaptive synthetic sampling approach for imbalanced learning [15]_
-    8. KMeans-SMOTE [17]_
-    9. ROSE - Random OverSampling Examples [19]_
-
-* Over-sampling followed by under-sampling
-    1. SMOTE + Tomek links [12]_
-    2. SMOTE + ENN [11]_
-
-* Ensemble classifier using samplers internally
-    1. Easy Ensemble classifier [13]_
-    2. Balanced Random Forest [16]_
-    3. Balanced Bagging
-    4. RUSBoost [18]_
-
-* Mini-batch resampling for Keras and TensorFlow
-
-The different algorithms are presented in the sphinx-gallery_.
-
-.. _sphinx-gallery: https://imbalanced-learn.readthedocs.io/en/stable/auto_examples/index.html
-
-
-References:
------------
-
-.. [1] : I. Tomek, “Two modifications of CNN,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 6, pp. 769-772, 1976.
-
-.. [2] : I. Mani, J. Zhang, “kNN approach to unbalanced data distributions: A case study involving information extraction,” In Proceedings of the Workshop on Learning from Imbalanced Data Sets, pp. 1-7, 2003.
-
-.. [3] : P. E. Hart, “The condensed nearest neighbor rule,” IEEE Transactions on Information Theory, vol. 14(3), pp. 515-516, 1968.
-
-.. [4] : M. Kubat, S. Matwin, “Addressing the curse of imbalanced training sets: One-sided selection,” In Proceedings of the 14th International Conference on Machine Learning, vol. 97, pp. 179-186, 1997.
-
-.. [5] : J. Laurikkala, “Improving identification of difficult small classes by balancing class distribution,” In Proceedings of the 8th Conference on Artificial Intelligence in Medicine in Europe, pp. 63-66, 2001.
-
-.. [6] : D. Wilson, “Asymptotic properties of nearest neighbor rules using edited data,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 2(3), pp. 408-421, 1972.
-
-.. [7] : M. R. Smith, T. Martinez, C. Giraud-Carrier, “An instance level analysis of data complexity,” Machine Learning, vol. 95(2), pp. 225-256, 2014.
-
-.. [8] : N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer, “SMOTE: Synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, vol. 16, pp. 321-357, 2002.
-
-.. [9] : H. Han, W.-Y. Wang, B.-H. Mao, “Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning,” In Proceedings of the 1st International Conference on Intelligent Computing, pp. 878-887, 2005.
-
-.. [10] : H. M. Nguyen, E. W. Cooper, K. Kamei, “Borderline over-sampling for imbalanced data classification,” In Proceedings of the 5th International Workshop on Computational Intelligence and Applications, pp. 24-29, 2009.
-
-.. [11] : G. E. A. P. A. Batista, R. C. Prati, M. C. Monard, “A study of the behavior of several methods for balancing machine learning training data,” ACM SIGKDD Explorations Newsletter, vol. 6(1), pp. 20-29, 2004.
-
-.. [12] : G. E. A. P. A. Batista, A. L. C. Bazzan, M. C. Monard, “Balancing training data for automated annotation of keywords: A case study,” In Proceedings of the 2nd Brazilian Workshop on Bioinformatics, pp. 10-18, 2003.
-
-.. [13] : X.-Y. Liu, J. Wu, Z.-H. Zhou, “Exploratory undersampling for class-imbalance learning,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 39(2), pp. 539-550, 2009.
-
-.. [14] : I. Tomek, “An experiment with the edited nearest-neighbor rule,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 6(6), pp. 448-452, 1976.
-
-.. [15] : H. He, Y. Bai, E. A. Garcia, S. Li, “ADASYN: Adaptive synthetic sampling approach for imbalanced learning,” In Proceedings of the 5th IEEE International Joint Conference on Neural Networks, pp. 1322-1328, 2008.
-
-.. [16] : C. Chen, A. Liaw, L. Breiman, “Using random forest to learn imbalanced data,” University of California, Berkeley, Tech. Rep. 110, pp. 1-12, 2004.
-
-.. [17] : F. Last, G. Douzas, F. Bacao, “Oversampling for imbalanced learning based on K-Means and SMOTE.”
-
-.. [18] : C. Seiffert, T. M. Khoshgoftaar, J. Van Hulse, A. Napolitano, “RUSBoost: A hybrid approach to alleviating class imbalance,” IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 40(1), pp. 185-197, 2010.
+You can refer to the `imbalanced-learn`_ documentation to find details about
+the implemented algorithms.
 
-.. [19] : G. Menardi, N. Torelli, “Training and assessing classification rules with unbalanced data,” Data Mining and Knowledge Discovery, vol. 28, pp. 92-122, 2014.
+.. _imbalanced-learn: https://imbalanced-learn.org/stable/user_guide.html
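
For orientation, the four re-sampling families removed above all share the same
``fit_resample`` interface in imbalanced-learn, while the ensemble methods are
drop-in scikit-learn classifiers. A minimal sketch, assuming imbalanced-learn
and scikit-learn are installed; the sampler chosen for each family is
illustrative, not the only option::

    from collections import Counter

    from sklearn.datasets import make_classification

    from imblearn.under_sampling import RandomUnderSampler
    from imblearn.over_sampling import SMOTE
    from imblearn.combine import SMOTETomek
    from imblearn.ensemble import BalancedRandomForestClassifier

    # Toy binary problem with a 9:1 class imbalance.
    X, y = make_classification(n_samples=1_000, weights=[0.9, 0.1], random_state=0)
    print("original:", Counter(y))

    # 1. Under-sampling: drop majority samples until the classes balance.
    X_u, y_u = RandomUnderSampler(random_state=0).fit_resample(X, y)
    print("under-sampled:", Counter(y_u))

    # 2. Over-sampling: synthesize new minority samples by SMOTE interpolation.
    X_o, y_o = SMOTE(random_state=0).fit_resample(X, y)
    print("over-sampled:", Counter(y_o))

    # 3. Combination: SMOTE over-sampling followed by Tomek-link cleaning.
    X_c, y_c = SMOTETomek(random_state=0).fit_resample(X, y)
    print("combined:", Counter(y_c))

    # 4. Ensemble: each tree is fitted on a balanced bootstrap of the data.
    clf = BalancedRandomForestClassifier(random_state=0).fit(X, y)
    print("balanced RF training accuracy:", clf.score(X, y))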
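
Because scikit-learn's own ``Pipeline`` has no notion of samplers,
imbalanced-learn also provides ``imblearn.pipeline.Pipeline``, which applies
samplers during ``fit`` only, so resampling never leaks into prediction or into
the test folds of cross-validation. A short sketch of that idiom; the final
estimator is an arbitrary illustrative choice::

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    from imblearn.over_sampling import SMOTE
    from imblearn.pipeline import Pipeline

    X, y = make_classification(n_samples=1_000, weights=[0.9, 0.1], random_state=0)

    # SMOTE runs only on each training split; validation folds stay untouched.
    model = Pipeline([
        ("smote", SMOTE(random_state=0)),
        ("clf", LogisticRegression(max_iter=1_000)),
    ])
    print(cross_val_score(model, X, y, scoring="balanced_accuracy").mean())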
