Seaborn's pairplot is very slow when Matplotlib.interactive(true) #8

@mrkn

Given the following Ruby script:

# pairplot.rb
require 'matplotlib'
Matplotlib.interactive(true) if ARGV[0] == 'interactive'
Matplotlib.rcParams['backend'] = 'module://ruby.matplotlib.backend_inline'
require 'matplotlib/pyplot'
require 'pandas'
sns = PyCall.import_module('seaborn')
df = Pandas.read_csv('data/raw/iris.csv', header: nil, names: [
  'sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class'
])
t = Time.now
sns.pairplot(df, hue: 'class')
d = Time.now - t
puts "#{d} sec"
Matplotlib::Pyplot.savefig('pairplot.png')

sns.pairplot is very slow only when Matplotlib.interactive(true) has been called: about 1.7 seconds without it, about 226 seconds with it.

$ ruby pairplot.rb
1.663664 sec
$ ruby pairplot.rb interactive
225.765168 sec

By contrast, with the following Python script:

# pairplot.py
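import sys
import matplotlib
# Mirror the Ruby script: enable interactive mode only when requested
if len(sys.argv) > 1 and sys.argv[1] == 'interactive':
    matplotlib.interactive(True)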
matplotlib.rcParams['backend'] = 'module://ipykernel.pylab.backend_inline'
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import time
df = pd.read_csv('data/raw/iris.csv', header=None, names=[
  'sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class'
])
t = time.time()
sns.pairplot(df, hue='class')
d = time.time() - t
print("{} sec".format(d))
plt.savefig('pairplot.png')

sns.pairplot takes less than 2 seconds even when matplotlib.interactive(True) is called.

$ python pairplot.py
1.7206127643585205 sec
$ python pairplot.py interactive
1.6132988929748535 sec
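
The timings above come from a single wall-clock measurement per run. That is more than enough to see a gap of this size, but for comparing smaller differences a helper along these lines may be useful. This is only a minimal sketch: time_call and its repeat parameter are hypothetical names, not part of matplotlib, seaborn, or matplotlib.rb, and it uses time.perf_counter instead of time.time because the former is a monotonic clock.

```python
import time


def time_call(fn, *args, repeat=3, **kwargs):
    """Call fn(*args, **kwargs) `repeat` times.

    Returns (result_of_last_call, best_elapsed_seconds).
    Hypothetical helper for illustration only.
    """
    if repeat < 1:
        raise ValueError("repeat must be at least 1")
    best = float("inf")
    result = None
    for _ in range(repeat):
        start = time.perf_counter()  # monotonic, high-resolution clock
        result = fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return result, best


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without plotting libraries.
    total, seconds = time_call(sum, range(1_000_000))
    print("{:.6f} sec".format(seconds))
```

In pairplot.py the call would look like time_call(sns.pairplot, df, hue='class', repeat=1), with repeat=1 so that only one figure is created.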

This may be a bug in the backend of matplotlib.rb.
