Creating a Unique Index on a MongoDB Collection with Python
阿新 • Published: 2018-12-03
You can use either the ensure_index or the create_index method; both take the same arguments. Note that ensure_index is deprecated in pymongo 3.x (see the source code at the end of this post), so create_index is the preferred choice.
First, connect to the target collection in the database:
col = MongoClient(the_client).get_database(the_db).get_collection(the_col)
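As a concrete illustration, against a local MongoDB instance the same one-liner could look like this (the URI, database name, and collection name are placeholders, not values from the original post):

from pymongo import MongoClient

# Hypothetical connection values for illustration only.
col = MongoClient("mongodb://localhost:27017/").get_database("test_db").get_collection("users")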
Then create the unique index. If the unique argument is omitted, an ordinary (non-unique) index is created, i.e. the default is unique=False:
col.create_index([("index_field_name", 1)], unique=True)
Here 1 and -1 mean ascending and descending order, respectively. Note that the keys must be passed as a list of (field, direction) tuples, e.g. [("field", 1)]; see the source code at the end of this post for details.
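For example, a compound unique index can mix both directions. A minimal sketch, reusing the col handle from above; the field names user_id and created_at are assumptions for illustration, and 1/-1 can also be written with the pymongo constants:

import pymongo

# 1 is pymongo.ASCENDING, -1 is pymongo.DESCENDING.
# The keys are passed as a list of (field, direction) tuples.
col.create_index(
    [("user_id", pymongo.ASCENDING), ("created_at", pymongo.DESCENDING)],
    unique=True,
)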
Example:
# -*- coding:utf-8 -*-
# Create an index on a MongoDB collection.
from pymongo import MongoClient


def create_mongodb_index(the_data_client, the_data_db, the_data_cl,
                         index_name, unique=False):
    data_client = MongoClient(the_data_client)
    data_db = data_client.get_database(the_data_db)
    data_col = data_db.get_collection(the_data_cl)
    print("start, the index is:", index_name)
    # ensure_index is deprecated in pymongo 3.x; create_index does the same job.
    data_col.create_index([(index_name, 1)], unique=unique)
    print("run over")


if __name__ == '__main__':
    DataClient = ''
    DataDB = ''
    DataCol = ''
    IndexName = ''
    create_mongodb_index(DataClient, DataDB, DataCol,
                         index_name=IndexName, unique=False)
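Once the unique index exists, inserting a document whose indexed value already appears in the collection is rejected by the server and pymongo raises pymongo.errors.DuplicateKeyError. A minimal sketch, reusing the col handle from above; the field name user_id is an assumption for illustration:

from pymongo.errors import DuplicateKeyError

col.insert_one({"user_id": 1})
try:
    col.insert_one({"user_id": 1})  # same indexed value again
except DuplicateKeyError as exc:
    print("duplicate rejected by the unique index:", exc)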
Appendix: pymongo source code
def create_index(self, keys, session=None, **kwargs):
    """Creates an index on this collection.

    Takes either a single key or a list of (key, direction) pairs.
    The key(s) must be an instance of :class:`basestring`
    (:class:`str` in python 3), and the direction(s) must be one of
    (:data:`~pymongo.ASCENDING`, :data:`~pymongo.DESCENDING`,
    :data:`~pymongo.GEO2D`, :data:`~pymongo.GEOHAYSTACK`,
    :data:`~pymongo.GEOSPHERE`, :data:`~pymongo.HASHED`,
    :data:`~pymongo.TEXT`).

    To create a single key ascending index on the key ``'mike'``
    we just use a string argument::

      >>> my_collection.create_index("mike")

    For a compound index on ``'mike'`` descending and ``'eliot'``
    ascending we need to use a list of tuples::

      >>> my_collection.create_index([("mike", pymongo.DESCENDING),
      ...                             ("eliot", pymongo.ASCENDING)])

    All optional index creation parameters should be passed as
    keyword arguments to this method. For example::

      >>> my_collection.create_index([("mike", pymongo.DESCENDING)],
      ...                            background=True)

    Valid options include, but are not limited to:

      - `name`: custom name to use for this index - if none is
        given, a name will be generated.
      - `unique`: if ``True`` creates a uniqueness constraint on the index.
      - `background`: if ``True`` this index should be created in the
        background.
      - `sparse`: if ``True``, omit from the index any documents that lack
        the indexed field.
      - `bucketSize`: for use with geoHaystack indexes. Number of documents
        to group together within a certain proximity to a given longitude
        and latitude.
      - `min`: minimum value for keys in a :data:`~pymongo.GEO2D` index.
      - `max`: maximum value for keys in a :data:`~pymongo.GEO2D` index.
      - `expireAfterSeconds`: <int> Used to create an expiring (TTL)
        collection. MongoDB will automatically delete documents from this
        collection after <int> seconds. The indexed field must be a UTC
        datetime or the data will not expire.
      - `partialFilterExpression`: A document that specifies a filter for
        a partial index.
      - `collation` (optional): An instance of
        :class:`~pymongo.collation.Collation`. This option is only
        supported on MongoDB 3.4 and above.

    See the MongoDB documentation for a full list of supported options by
    server version.

    .. warning:: `dropDups` is not supported by MongoDB 3.0 or newer. The
      option is silently ignored by the server and unique index builds
      using the option will fail if a duplicate value is detected.

    .. note:: `partialFilterExpression` requires server version **>= 3.2**

    .. note:: The :attr:`~pymongo.collection.Collection.write_concern` of
       this collection is automatically applied to this operation when
       using MongoDB >= 3.4.

    :Parameters:
      - `keys`: a single key or a list of (key, direction) pairs
        specifying the index to create
      - `session` (optional): a
        :class:`~pymongo.client_session.ClientSession`.
      - `**kwargs` (optional): any additional index creation options
        (see the above list) should be passed as keyword arguments

    .. versionchanged:: 3.6
       Added ``session`` parameter. Added support for passing maxTimeMS
       in kwargs.
    .. versionchanged:: 3.4
       Apply this collection's write concern automatically to this
       operation when connected to MongoDB >= 3.4. Support the
       `collation` option.
    .. versionchanged:: 3.2
        Added partialFilterExpression to support partial indexes.
    .. versionchanged:: 3.0
        Renamed `key_or_list` to `keys`. Removed the `cache_for` option.
        :meth:`create_index` no longer caches index names. Removed support
        for the drop_dups and bucket_size aliases.

    .. mongodoc:: indexes
    """
    keys = helpers._index_list(keys)
    name = kwargs.setdefault("name", helpers._gen_index_name(keys))
    cmd_options = {}
    if "maxTimeMS" in kwargs:
        cmd_options["maxTimeMS"] = kwargs.pop("maxTimeMS")
    self.__create_index(keys, kwargs, session, **cmd_options)
    return name
def __create_index(self, keys, index_options, session, **kwargs):
    """Internal create index helper.

    :Parameters:
      - `keys`: a list of tuples [(key, type), (key, type), ...]
      - `index_options`: a dict of index options.
      - `session` (optional): a
        :class:`~pymongo.client_session.ClientSession`.
    """
    index_doc = helpers._index_document(keys)
    index = {"key": index_doc}
    collation = validate_collation_or_none(
        index_options.pop('collation', None))
    index.update(index_options)

    with self._socket_for_writes() as sock_info:
        if collation is not None:
            if sock_info.max_wire_version < 5:
                raise ConfigurationError(
                    'Must be connected to MongoDB 3.4+ to use collations.')
            else:
                index['collation'] = collation
        cmd = SON([('createIndexes', self.name), ('indexes', [index])])
        cmd.update(kwargs)
        self._command(
            sock_info, cmd, read_preference=ReadPreference.PRIMARY,
            codec_options=_UNICODE_REPLACE_CODEC_OPTIONS,
            write_concern=self._write_concern_for(session),
            session=session)
def ensure_index(self, key_or_list, cache_for=300, **kwargs):
    """**DEPRECATED** - Ensures that an index exists on this collection.

    .. versionchanged:: 3.0
        **DEPRECATED**
    """
    warnings.warn("ensure_index is deprecated. Use create_index instead.",
                  DeprecationWarning, stacklevel=2)
    # The types supported by datetime.timedelta.
    if not (isinstance(cache_for, integer_types) or
            isinstance(cache_for, float)):
        raise TypeError("cache_for must be an integer or float.")

    if "drop_dups" in kwargs:
        kwargs["dropDups"] = kwargs.pop("drop_dups")

    if "bucket_size" in kwargs:
        kwargs["bucketSize"] = kwargs.pop("bucket_size")

    keys = helpers._index_list(key_or_list)
    name = kwargs.setdefault("name", helpers._gen_index_name(keys))

    # Note that there is a race condition here. One thread could
    # check if the index is cached and be preempted before creating
    # and caching the index. This means multiple threads attempting
    # to create the same index concurrently could send the index
    # to the server two or more times. This has no practical impact
    # other than wasted round trips.
    if not self.__database.client._cached(self.__database.name,
                                          self.__name, name):
        self.__create_index(keys, kwargs, session=None)
        self.__database.client._cache_index(self.__database.name,
                                            self.__name, name, cache_for)
        return name
    return None
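To confirm that an index was actually created, or to remove it again, pymongo's Collection.index_information() and Collection.drop_index() can be used. A minimal sketch, reusing the col handle from above; the index name "user_id_1" assumes the default name pymongo generates for a single ascending key called user_id:

# List all indexes on the collection; the dict keys are index names.
print(col.index_information())

# The default generated name is "<field>_<direction>", e.g. "user_id_1".
col.drop_index("user_id_1")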