OPTICS (Ordering Points To Identify the Clustering Structure) Algorithm Implementation

阿新 • Published: 2019-01-06

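This article follows the description of the OPTICS algorithm in Data Mining: Concepts and Techniques (3rd edition), draws on the blogger 皮果提's summary of OPTICS (http://blog.csdn.net/itplus/article/details/10089323), and builds on the DBSCAN implementation I wrote earlier (http://blog.csdn.net/lengo/article/details/78700607). In my view OPTICS is essentially similar to DBSCAN. The difference is that it additionally orders the points by reachability distance and uses that ordered sequence to express the structure of the different clusters. The experimental dataset is fairly large; download it from the network drive: https://pan.baidu.com/s/1pLXFsJx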
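
The ordering relies on two quantities from the textbook description: the core distance of an object p (the distance to its minPts-th nearest point within the eps-neighborhood, counting p itself) and the reachability distance of a neighbor q with respect to p (the larger of p's core distance and the distance between p and q). The following is a minimal MATLAB sketch of those standard definitions on made-up data. The names X, epsRadius and p and all the values are assumptions for illustration only; the program in this article simplifies matters by storing the plain Euclidean distance to each neighbor.

```matlab
% Toy data (hypothetical values): six points on a line plus one distant point
X = [0 0; 1 0; 2 0; 3 0; 4 0; 4.5 0; 30 30];
epsRadius = 5;   % neighborhood radius; named epsRadius so MATLAB's built-in eps is not shadowed
minPts = 5;      % density threshold, counting the point itself

p = 1;                                   % index of the object under examination
diffs = bsxfun(@minus, X, X(p,:));       % coordinate differences to p
d = sqrt(sum(diffs.^2, 2));              % Euclidean distance from p to every point
inEps = d <= epsRadius;                  % eps-neighborhood of p (includes p itself)

if nnz(inEps) >= minPts
    dSorted = sort(d(inEps));
    coreDist = dSorted(minPts);          % distance to the minPts-th nearest point, p included
else
    coreDist = Inf;                      % p is not a core object
end

% Reachability distance of every point with respect to p:
% max(core distance, actual distance) inside the neighborhood, Inf outside.
% The entry for p itself is computed too but carries no meaning.
reachDist = Inf(size(d));
reachDist(inEps) = max(coreDist, d(inEps));

fprintf('core distance of p: %g\n', coreDist);
disp(reachDist');
```

In this toy set the point at distance 4.5 keeps its own distance because it lies beyond the core distance, while the closer neighbors are all lifted to the core distance. That smoothing is what makes dense regions appear as flat valleys in the ordered plot.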

The main program is as follows:

clc;
clear;
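%Read the file; the first two columns are the x,y coordinates
fileID = fopen('D:\matlabFile\OPTICS\optics.txt');
DS=textscan(fileID,'%f %f');
fclose(fileID);
%Neighborhood radius threshold (note: this variable shadows MATLAB's built-in eps)
eps=0.5;
%Density threshold
minPts=5;
%Convert the cell data to matrix form
DB=cat(2,DS{1},DS{2});
% scatter(DB(:,1),DB(:,2),'filled');
%Add a third column: the sequential ID of each point
ind=1:size(DB,1);
DB=cat(2,DB,ind');
%Add a last column as the visited flag, initially 0 (unvisited)
Col=zeros(size(DB,1),1);
DB=cat(2,DB,Col);
%Result queue: column 1 is the point ID, column 2 its reachability distance, column 3 the cluster number
Outputs=zeros(1,3);
%Outlier queue, stores the outlier results
Outliers=zeros(1,1);
DB_back=DB;
%Cluster number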
count=1;
while ~isempty(DB_back)
    Row=size(DB_back,1);
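    %Randomly select an object as the starting point
    index=randi(Row,1,1);
    p=DB_back(index,:);
    %Make sure the target has not been visited
    if p(1,4)==0
        %Mark the target object as visited in the original dataset
        DB(p(1,3),4)=1;
        %Delete row index from the backup dataset
        DB_back(index,:)=[];
        %Compute the reachability distances from target object p to the elements of the dataset; the result is the sorted queue (column 1: point ID, column 2: distance)
        OrderSeeds=ReachedDist(p,DB,eps);
        %If the number of entries in the sorted queue meets the density threshold, the target is a core object and is added to the output queue
        if size(OrderSeeds,1)>=minPts-1
            %Take the object that satisfies the density threshold
            Obj=OrderSeeds(minPts-1,:);
            %Determine the core distance of the target object
            coreDist=Obj(1,2);
            %Output the target object's ID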
            Outputs=cat(1,Outputs,[p(1,3),coreDist,count]);
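            %Traverse the sorted queue
            while ~isempty(OrderSeeds)
                %Tracks whether this pass expanded a core object
                expanded=false;
                %Take the entries of the sorted queue in turn
                for i=1:size(OrderSeeds,1)
                    q=DB(OrderSeeds(i,1),:);
                    %Check whether q has been visited
                    if q(1,4)==0
                        DB(OrderSeeds(i,1),4)=1;
                        %Compute the reachability distances from target object q to the elements of the dataset
                        subOrderSeeds=ReachedDist(q,DB,eps);
                        %Same core-object test as the one applied to p
                        if size(subOrderSeeds,1)>=minPts-1
                            %Check whether q is already in the result queue; add it if not
                            r1=find(Outputs(:,1)==q(1,3));
                            if isempty(r1)
                                Outputs=cat(1,Outputs,[OrderSeeds(i,:),count]);
                            end
                            %Go through the neighbors of q
                            for j=1:size(subOrderSeeds,1)
                                %Check whether the neighbor is already in the result queue
                                r2=find(Outputs(:,1)==subOrderSeeds(j,1));
                                if isempty(r2)
                                    %Check whether it is already in the sorted queue; add it if not
                                    r3=find(OrderSeeds(:,1)==subOrderSeeds(j,1));
                                    if isempty(r3)
                                        OrderSeeds=cat(1,OrderSeeds,subOrderSeeds(j,:));
                                    end
                                end
                            end
                            %Remove q from the sorted queue
                            OrderSeeds(i,:)=[];
                            %Re-sort the queue by reachability distance
                            OrderSeeds=sortrows(OrderSeeds,2);
                            %Delete object q from the backup dataset
                            r4=find(DB_back(:,3)==q(1,3));
                            if ~isempty(r4)
                                DB_back(r4,:)=[];
                            end
                            expanded=true;
                            %Leave the for loop
                            break;
                        end
                    end
                end
                %No unvisited core object is left among the seeds: stop here, otherwise the while loop would never terminate
                if ~expanded
                    break;
                end
            end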
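            %Increment the cluster number
            count=count+1;
        else
            %Store the target object in the outlier queue
            Outliers=cat(1,Outliers,p(1,3));
        end
    end
end
%Delete the leading zero rows
Outliers(1,:)=[];
Outputs(1,:)=[];
%Display the results
x0=0;
%Retrieve the cluster numbers
CID=unique(Outputs(:,3));
for i=1:length(CID)
    r5=find(Outputs(:,3)==CID(i));
    y=Outputs(r5,:);
    x=1+x0:size(y,1)+x0;
    %Clamp to [0,1] so the RGB triplet stays valid when there are more than five clusters
    c=min(max([i*0.2 1-i*0.1 1-i*0.1],0),1);
    stem(x,y(:,2),'Marker','none','Color',c);
    x0=x0+size(y,1);
    hold on
end
hold off
xlabel('Index');
ylabel('Reachability distance');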
ylim([0 0.7]);
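
In a plot like the one produced above, each valley of low reachability distances corresponds to a dense group and each peak marks the transition to another group. As a rough illustration of how such an ordering can be turned into labels, the following MATLAB sketch cuts a reachability sequence at a fixed level. The values in reach and the level threshold are made up for the example and are not taken from the article's data. It is a simplified cut: a full extraction would also consult the core distance to decide whether a peak point is noise.

```matlab
% Hypothetical reachability values in output order; the first entry is Inf
% because the first point of an ordering has no predecessor
reach = [Inf 0.10 0.12 0.11 0.45 0.09 0.10 0.60 0.08 0.07];
threshold = 0.25;   % assumed cut level, has to be chosen per dataset

labels = zeros(size(reach));
clusterId = 0;
for k = 1:numel(reach)
    % A value above the cut level means the point is not reachable from
    % the preceding group at this density, so it starts a new group
    if reach(k) > threshold
        clusterId = clusterId + 1;
    end
    labels(k) = clusterId;
end

disp(labels);
```

Lowering threshold splits the sequence into more, denser groups, and raising it merges neighboring valleys. This is how a single ordering can describe cluster structure at several density levels.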

The ReachedDist function is implemented as follows:

function OrderSeeds=ReachedDist(point,Dataset,eps)
    OrderSeeds=zeros(1,2);
    for i=1:size(Dataset,1)
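        %Compute the distance between the two points
        dist=sqrt((Dataset(i,1)-point(1))^2+(Dataset(i,2)-point(2))^2);
        %Skip the point itself (and exact duplicates), whose distance is zero
        if dist==0
            continue;
        end
        %If within the neighborhood, add to the sorted queue
        if dist<=eps
            temp=[Dataset(i,3),dist];
            OrderSeeds=cat(1,OrderSeeds,temp);
        end
    end
    %Delete the leading zero row
    OrderSeeds(1,:)=[];
    %If the queue is non-empty, sort it
    if ~isempty(OrderSeeds)
        %Sort by the distance in column 2, in ascending order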
        OrderSeeds=sortrows(OrderSeeds,2);
    end
end
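
The loop above grows OrderSeeds one row at a time, which gets slow on a large dataset. As a hedged alternative, the same neighborhood query can be written with vectorized operations. The sketch below assumes the article's column layout (x, y, point ID, visited flag) and uses a made-up four-point dataset. The variable radius stands in for the eps argument, and keep is a helper name introduced only for this example.

```matlab
% Hypothetical toy dataset in the article's layout: x, y, point ID, visited flag
Dataset = [0 0 1 0; 3 4 2 0; 0 1 3 0; 0.5 0 4 0];
point = Dataset(1,:);   % query point
radius = 1.5;           % neighborhood radius (plays the role of eps)

% Distance from the query point to every row at once
dist = sqrt((Dataset(:,1)-point(1)).^2 + (Dataset(:,2)-point(2)).^2);
% Keep neighbors inside the radius, excluding zero-distance rows (the point itself)
keep = dist > 0 & dist <= radius;
% Column 1: point ID, column 2: distance, sorted by ascending distance
OrderSeeds = sortrows([Dataset(keep,3), dist(keep)], 2);

disp(OrderSeeds);
```

Here point 2 lies at distance 5 and is dropped, so the result lists point 4 (distance 0.5) before point 3 (distance 1), the same two-column layout the main program expects.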
