Description: Measures performance for breadth-first search on an undirected (random, k-way) graph. Two algorithms are implemented. The first is from
a team at Stanford, described in this paper: http://dl.acm.org/citation.cfm?id=1941590. The second is from UIUC and detailed here: http://dl.acm.org/citation.cfm?id=1837289. Some minor changes were made to these specifications for correctness and cross-platform compatibility.
Problem Sizes: (Graph vertices) - 1000, 10000, 100000, 1000000
Precision: N/A (graph data is unsigned int)
Includes PCIe Transfer Time: Yes, in [testName]_PCIe measurements
Parameters:
–graph_file instead of using a random graph, load one from a METIS file
–degree control the average degree of nodes, default 2
–algo select the algorithm to use, 1-IIT 2-UIUC
–dump-pl dump the path lengths to file
–source_vertex specify the vertex to start the search from, default 0
–global-barrier enable the use of a device global barrier in the UIUC algorithm. This can occasionally improve performance by reducing kernel launch overhead.
Specific Tests:
BFS - achieved memory bandwidth in
GB/s
BFS_total - runtime in seconds (kernels and pcie)
BFS_kernel_time - runtime of all kernels in seconds
BFS_teps - achieved bandwidth in traversed edges per second, a common metric for graph algorithms
BFS_visited_vertices Reports the number of vertices visited in the traversal, should match the number of vertices in the connected component which contains the source vertex.