Sorting and selection on NumPy arrays: sort, argsort, partition, argpartition, searchsorted, lexsort, and more

1. numpy.sort

numpy.sort(a, axis=-1, kind='quicksort', order=None)
Return a sorted copy of an array.

Parameters:
a : array_like
Array to be sorted.

axis : int or None, optional
Axis along which to sort. If None, the array is flattened before sorting. The default is -1, which sorts along the last axis.

kind : {‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, optional

Sorting algorithm. Default is ‘quicksort’.

order : str or list of str, optional
When a is an array with fields defined, this argument specifies which fields to compare first, second, etc. A single field can be specified as a string, and not all fields need be specified, but unspecified fields will still be used, in the order in which they come up in the dtype, to break ties.

Returns:

sorted_array : ndarray
Array of the same type and shape as a.

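Purpose: returns a copy of the array sorted along the given axis.

Argument  Description
a         Array or array-like object to sort
axis      Axis along which to sort.
          axis=None: all elements are sorted and a 1-D array is returned;
          any other integer: sort along that axis. Default is -1 (the last axis).
kind      Sorting algorithm, optional: {‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}
order     For structured arrays, the field(s) to sort by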
>>> np.sort([[1,3,2],[9,7,8],[6,5,4]], axis=0)
array([[1, 3, 2],
       [6, 5, 4],
       [9, 7, 8]])
>>> np.sort([[1,3,2],[9,7,8],[6,5,4]], axis=1)
array([[1, 2, 3],
       [7, 8, 9],
       [4, 5, 6]])
>>> np.sort([[1,3,2],[9,7,8],[6,5,4]])
array([[1, 2, 3],
       [7, 8, 9],
       [4, 5, 6]])
>>> np.sort([[1,3,2],[9,7,8],[6,5,4]], axis=None)
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
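
The order argument only matters for structured arrays, which the examples above do not cover. The following is a minimal sketch of how it might be used; the records and the field names (name, height, age) are made up for illustration.

```python
import numpy as np

# Hypothetical records: (name, height, age). The field names are illustrative only.
dtype = [("name", "U10"), ("height", float), ("age", int)]
values = [("Arthur", 1.8, 41), ("Lancelot", 1.9, 38), ("Galahad", 1.7, 38)]
people = np.array(values, dtype=dtype)

# Sort by a single field.
by_height = np.sort(people, order="height")

# Sort by age first; ties on age are broken by height.
by_age_then_height = np.sort(people, order=["age", "height"])

print(by_height["name"])
print(by_age_then_height["name"])
```

Passing a list to order gives a multi-key sort without building the index array by hand.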

2. numpy.argsort

numpy.argsort(a, axis=-1, kind='quicksort', order=None)
Returns the indices that would sort an array.

Perform an indirect sort along the given axis using the algorithm specified by the kind keyword. It returns an array of indices of the same shape as a that index data along the given axis in sorted order.

Parameters:
a : array_like
Array to sort.

axis : int or None, optional
Axis along which to sort. The default is -1 (the last axis). If None, the flattened array is used.

kind : {‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, optional
Sorting algorithm.

order : str or list of str, optional
When a is an array with fields defined, this argument specifies which fields to compare first, second, etc. A single field can be specified as a string, and not all fields need be specified, but unspecified fields will still be used, in the order in which they come up in the dtype, to break ties.

Returns:
index_array : ndarray, int
Array of indices that sort a along the specified axis. If a is one-dimensional, a[index_array] yields a sorted a. More generally, np.take_along_axis(a, index_array, axis=axis) always yields the sorted a, irrespective of dimensionality.

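Purpose: returns the indices that would sort the array along the given axis, i.e. the position in the original array of each element of the sorted result.

Argument  Description
a         Array or array-like object to sort
axis      Axis along which to sort.
          axis=None: the flattened array is sorted and a 1-D index array is returned;
          any other integer: sort along that axis. Default is -1 (the last axis).
kind      Sorting algorithm, optional: {‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}
order     For structured arrays, the field(s) to sort by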
>>> np.argsort([[1,3,2],[9,7,8],[6,5,4]], axis=0)
array([[0, 0, 0],
       [2, 2, 2],
       [1, 1, 1]], dtype=int64)
>>> np.argsort([[1,3,2],[9,7,8],[6,5,4]], axis=1)
array([[0, 2, 1],
       [1, 2, 0],
       [2, 1, 0]], dtype=int64)
>>> np.argsort([[1,3,2],[9,7,8],[6,5,4]])
array([[0, 2, 1],
       [1, 2, 0],
       [2, 1, 0]], dtype=int64)
>>> np.argsort([[1,3,2],[9,7,8],[6,5,4]], axis=None)
array([0, 2, 1, 8, 7, 6, 4, 5, 3], dtype=int64)
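
For multi-dimensional input, the indices returned by argsort cannot be applied with plain a[index_array]. A small sketch of applying them with np.take_along_axis follows; the labels array is a made-up companion array, used only to show that the same indices can reorder a second array of the same shape.

```python
import numpy as np

a = np.array([[1, 3, 2], [9, 7, 8], [6, 5, 4]])

# Indices that sort each row; 'stable' keeps equal elements in their original order.
idx = np.argsort(a, axis=1, kind="stable")

# Applying the indices along the same axis reproduces np.sort(a, axis=1).
sorted_rows = np.take_along_axis(a, idx, axis=1)

# Hypothetical companion array, reordered consistently with a.
labels = np.array([["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]])
labels_sorted = np.take_along_axis(labels, idx, axis=1)

print(sorted_rows)
print(labels_sorted)
```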

3. numpy.partition

numpy.partition(a, kth, axis=-1, kind='introselect', order=None)
Return a partitioned copy of an array.

Creates a copy of the array with its elements rearranged in such a way that the value of the element in k-th position is in the position it would be in a sorted array. All elements smaller than the k-th element are moved before this element and all equal or greater are moved behind it. The ordering of the elements in the two partitions is undefined.

New in version 1.8.0.

Parameters:
a : array_like
Array to be sorted.

kth : int or sequence of ints
Element index to partition by. The k-th value of the element will be in its final sorted position and all smaller elements will be moved before it and all equal or greater elements behind it. The order of all elements in the partitions is undefined. If provided with a sequence of k-th it will partition all elements indexed by k-th of them into their sorted position at once.

axis : int or None, optional
Axis along which to sort. If None, the array is flattened before sorting. The default is -1, which sorts along the last axis.

kind : {‘introselect’}, optional
Selection algorithm. Default is ‘introselect’.

order : str or list of str, optional
When a is an array with fields defined, this argument specifies which fields to compare first, second, etc. A single field can be specified as a string. Not all fields need be specified, but unspecified fields will still be used, in the order in which they come up in the dtype, to break ties.

Returns:
partitioned_array : ndarray
Array of the same type and shape as a.

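Purpose: returns a partitioned (partially sorted) copy of the array along the given axis. In the returned copy, the element at position k is no smaller than any element to its left and no larger than any element to its right. In other words, the element at position k sits in its final sorted position.

Argument  Description
a         Array or array-like object to partition
kth       Position k; several positions may be given
axis      Axis along which to partition.
          axis=None: the array is flattened first and a 1-D array is returned;
          any other integer: partition along that axis. Default is -1 (the last axis).
kind      Selection algorithm, optional: {‘introselect’}
order     For structured arrays, the field(s) to compare by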
>>> a = np.array([ 8, 10,  3,  2, 14, 9,  7, 11,  0, 13, 6,  1,  5,  4, 12, 15])
>>> np.partition(a, 4)		# the element at index 4 (the value 4) is in its sorted position
array([ 2,  0,  1,  3,  4,  5,  6, 10,  7,  8,  9, 11, 13, 14, 12, 15])
>>> np.partition(a, (4,12))	# the elements at indices 4 and 12 are in their sorted positions
array([ 2,  0,  1,  3,  4,  5,  6, 10,  7,  8,  9, 11, 12, 14, 13, 15])
>>> a = np.asarray([np.random.randint(0,10, 10) for _ in range(5)])
>>> a
array([[1, 0, 2, 2, 0, 1, 7, 3, 6, 9],
       [9, 8, 0, 2, 8, 2, 0, 2, 6, 9],
       [2, 6, 3, 5, 5, 7, 7, 3, 2, 2],
       [3, 2, 7, 8, 5, 1, 2, 1, 3, 1],
       [4, 0, 1, 4, 3, 2, 2, 3, 9, 7]])
>>> np.partition(a, 2, axis=0)	# partition each column; row 2 of the result is the boundary
array([[1, 0, 0, 2, 0, 1, 0, 1, 2, 1],
       [2, 0, 1, 2, 3, 1, 2, 2, 3, 2],
       [3, 2, 2, 4, 5, 2, 2, 3, 6, 7],
       [9, 6, 7, 8, 5, 7, 7, 3, 6, 9],
       [4, 8, 3, 5, 8, 2, 7, 3, 9, 9]])
>>> np.partition(a, 5, axis=1)		# partition each row; column 5 of the result is the boundary
array([[0, 0, 1, 1, 2, 2, 3, 6, 7, 9],
       [2, 2, 0, 0, 2, 6, 8, 8, 9, 9],
       [2, 2, 2, 3, 3, 5, 5, 6, 7, 7],
       [1, 1, 1, 2, 2, 3, 3, 8, 7, 5],
       [0, 1, 2, 2, 3, 3, 4, 4, 9, 7]])
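
A common use of partition is selecting the k smallest or k largest values without sorting the whole array. The sketch below assumes a 1-D array and a small k. Because the order inside each partition is undefined, the selected slice is sorted afterwards.

```python
import numpy as np

a = np.array([8, 10, 3, 2, 14, 9, 7, 11, 0, 13, 6, 1, 5, 4, 12, 15])
k = 3

# After partitioning at k-1, the first k entries are the k smallest (in arbitrary order).
smallest = np.sort(np.partition(a, k - 1)[:k])

# A negative kth counts from the end, so the last k entries are the k largest.
largest = np.sort(np.partition(a, -k)[-k:])

print(smallest, largest)
```

Only the k selected values are fully sorted, which can be cheaper than np.sort(a)[:k] when the array is large and k is small.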

4. numpy.argpartition

numpy.argpartition(a, kth, axis=-1, kind='introselect', order=None)
Perform an indirect partition along the given axis using the algorithm specified by the kind keyword. It returns an array of indices of the same shape as a that index data along the given axis in partitioned order.

New in version 1.8.0.

Parameters:
a : array_like
Array to sort.

kth : int or sequence of ints
Element index to partition by. The k-th element will be in its final sorted position and all smaller elements will be moved before it and all larger elements behind it. The order of all elements in the partitions is undefined. If provided with a sequence of k-th it will partition all of them into their sorted position at once.

axis : int or None, optional
Axis along which to sort. The default is -1 (the last axis). If None, the flattened array is used.

kind : {‘introselect’}, optional
Selection algorithm. Default is ‘introselect’.

order : str or list of str, optional
When a is an array with fields defined, this argument specifies which fields to compare first, second, etc. A single field can be specified as a string, and not all fields need be specified, but unspecified fields will still be used, in the order in which they come up in the dtype, to break ties.

Returns:
index_array : ndarray, int
Array of indices that partition a along the specified axis. If a is one-dimensional, a[index_array] yields a partitioned a. More generally, np.take_along_axis(a, index_array, axis=axis) always yields the partitioned a, irrespective of dimensionality.

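Purpose: returns the index array that partitions (partially sorts) the array along the given axis. The element referenced by the index at position k is no smaller than the elements referenced by the indices before position k, and no larger than those referenced by the indices after it.

Argument  Description
a         Array or array-like object to partition
kth       Position k; several positions may be given
axis      Axis along which to partition.
          axis=None: the array is flattened first and a 1-D index array is returned;
          any other integer: partition along that axis. Default is -1 (the last axis).
kind      Selection algorithm, optional: {‘introselect’}
order     For structured arrays, the field(s) to compare by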
>>> a = np.array([ 8, 10,  3,  2, 14, 9,  7, 11,  0, 13, 6,  1,  5,  4, 12, 15])
>>> np.argpartition(a, 4)
array([ 3,  8, 11,  2, 13, 12, 10,  1,  6,  0,  5,  7,  9,  4, 14, 15],
      dtype=int64)
>>> np.argpartition(a, (4,12))
array([ 3,  8, 11,  2, 13, 12, 10,  1,  6,  0,  5,  7, 14,  4,  9, 15],
      dtype=int64)
>>> a[np.argpartition(a, (4,12))]	# equivalent to np.partition(a, (4,12))
array([ 2,  0,  1,  3,  4,  5,  6, 10,  7,  8,  9, 11, 12, 14, 13, 15])
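
argpartition is often used to find the positions of the top-k entries in each row of a 2-D array. The following is a sketch of that pattern; the scores array and k are made-up values. Since the order within the selected block is undefined, the k candidates are sorted in a second step.

```python
import numpy as np

# Hypothetical score matrix: one row per sample, one column per class.
scores = np.array([[0.1, 0.7, 0.2, 0.9],
                   [0.8, 0.3, 0.6, 0.4]])
k = 2

# Indices of the k largest entries in each row, in arbitrary order.
top_idx = np.argpartition(scores, -k, axis=1)[:, -k:]
top_val = np.take_along_axis(scores, top_idx, axis=1)

# Order those k candidates from largest to smallest.
order = np.argsort(-top_val, axis=1)
top_idx = np.take_along_axis(top_idx, order, axis=1)
top_val = np.take_along_axis(top_val, order, axis=1)

print(top_idx)
print(top_val)
```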

5. numpy.searchsorted

numpy.searchsorted(a, v, side='left', sorter=None)
Find indices where elements should be inserted to maintain order.

Find the indices into a sorted array a such that, if the corresponding elements in v were inserted before the indices, the order of a would be preserved.

Assuming that a is sorted:

side     returned index i satisfies
left     a[i-1] < v <= a[i]
right    a[i-1] <= v < a[i]

Parameters:
a : 1-D array_like
Input array. If sorter is None, then it must be sorted in ascending order, otherwise sorter must be an array of indices that sort it.

v : array_like
Values to insert into a.

side : {‘left’, ‘right’}, optional
If ‘left’, the index of the first suitable location found is given. If ‘right’, return the last such index. If there is no suitable index, return either 0 or N (where N is the length of a).

sorter : 1-D array_like, optional
Optional array of integer indices that sort array a into ascending order. They are typically the result of argsort.

New in version 1.7.0.

Returns:
indices : array of ints
Array of insertion points with the same shape as v.

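Purpose: given an array sorted in ascending order and one or more values to insert, returns the insertion positions that keep the array sorted.

Argument  Description
a         Array or array-like object sorted in ascending order
v         Value(s) to insert; several values may be given
side      Which insertion position to prefer when equal values are present.
          ‘left’: a[i-1] < v <= a[i]; ‘right’: a[i-1] <= v < a[i]
sorter    Optional index array that sorts a into ascending order, typically the result of argsort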
>>> np.searchsorted([1,2,3,4,5], 3)
2
>>> np.searchsorted([1,2,3,4,5], 3, side='right')
3
>>> np.searchsorted([1,2,3,4,5], [-10, 10, 2, 3])
array([0, 5, 1, 2])
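
The examples above do not exercise sorter or show how side behaves with duplicates. The sketch below illustrates both on made-up arrays: sorter lets searchsorted work on an unsorted array, and the difference between the ‘right’ and ‘left’ positions counts how many times a value occurs.

```python
import numpy as np

# Unsorted input: pass the argsort result as sorter instead of sorting a first.
a = np.array([40, 10, 30, 20, 50])
sorter = np.argsort(a)                      # a[sorter] is [10, 20, 30, 40, 50]
pos = np.searchsorted(a, [30, 35], sorter=sorter)
# pos holds positions in the sorted order a[sorter], not in a itself.

# With duplicates, 'left' and 'right' bracket the run of equal values.
b = np.array([1, 2, 2, 2, 3])
left = np.searchsorted(b, 2, side="left")
right = np.searchsorted(b, 2, side="right")
count = right - left                        # number of occurrences of 2 in b

print(pos, left, right, count)
```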

6. numpy.lexsort

numpy.lexsort(keys, axis=-1)
Perform an indirect stable sort using a sequence of keys.

Given multiple sorting keys, which can be interpreted as columns in a spreadsheet, lexsort returns an array of integer indices that describes the sort order by multiple columns. The last key in the sequence is used for the primary sort order, the second-to-last key for the secondary sort order, and so on. The keys argument must be a sequence of objects that can be converted to arrays of the same shape. If a 2D array is provided for the keys argument, its rows are interpreted as the sorting keys and sorting is according to the last row, second last row etc.
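
The key ordering described above is easy to get backwards, so here is a short sketch with made-up names: surnames is the primary key and is therefore passed last, while first_names only breaks ties.

```python
import numpy as np

# Hypothetical data: sort by surname, then by first name.
surnames = ("Hertz", "Galilei", "Hertz")
first_names = ("Heinrich", "Galileo", "Gustav")

# The LAST key in the sequence is the primary sort key.
ind = np.lexsort((first_names, surnames))

ordered = [f"{surnames[i]}, {first_names[i]}" for i in ind]
print(ind)
print(ordered)
```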

Parameters:
keys : (k, N) array or tuple containing k (N,)-shaped sequences
The k different “columns” to be sorted. The last column (or row if keys is a 2D array) is the primary sort key.

axis : int, optio