Skip to content

[SYSTEMDS-3951] MM Chain Optimization w/ Basic Average Estimator#2525

Open
ywcb00 wants to merge 5 commits into
apache:mainfrom
ywcb00:fix/hops/rewrite/sparse
Open

[SYSTEMDS-3951] MM Chain Optimization w/ Basic Average Estimator#2525
ywcb00 wants to merge 5 commits into
apache:mainfrom
ywcb00:fix/hops/rewrite/sparse

Conversation

@ywcb00

@ywcb00 ywcb00 commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Hi,
this PR changes the estimator for sparse mm chain rewrites from MNC to the metadata average estimator. Integration of MNC for these rewrites is missing the information about the underlying matrix histograms.
This PR also fixes the corresponding test, which previously only validated if there was any mm chain optimization.
Thank you and all the best,
David

ywcb00 added 4 commits June 29, 2026 13:20
…): delete content of method to get the input matrices since it does not work and replace it with a not implemented exception for now

chore(main/hops/rewrite/ProgramRewriteStatus.java): remove unused methods and member variables to make clear that the program rewrite status does not hold a variable map
… leaf with its metadata only

	change predicate method to check whether it is a leaf accordingly
… check if the mm chain was rewritten with respect to sparsity by checking for the respective keyword in the trace log

chore(test/AutomatedTestBase.java): add check if buffer is null when checking if a buffer contains a certain string
…stimator by the basic average estimator for matrix multiplication chain rewrites

	NOTE: Missing information about the matrix histograms of input data caused certain limitations for the implementation with the mnc estimator.
if( LOG.isTraceEnabled() ){
LOG.trace("mmchainopt [i="+(i+1)+",j="+(j+1)+"]: costs = "+dpMatrix[i][j]+", split = "+(split[i][j]+1));
LOG.trace("mmchainoptsparse [i="+(i+1)+",j="+(j+1)+"]: costs = "+dpMatrix[i][j]+", split = "+(split[i][j]+1));
System.out.println("mmchainoptsparse [i="+(i+1)+",j="+(j+1)+"]: costs = "+dpMatrix[i][j]+", split = "+(split[i][j]+1));

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Baunsgaard do you have any idea of how to validate this in the corresponding test without printing it to System.out?
Best would be to capture the LOG.trace one line before. I already enabled the trace log in the test class, do you know a possibility to capture it then?
Help is very much appreciated. Thank you in advance. :)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

… capture and validate the log output to remove the necessity of having a console print in the code
@ywcb00 ywcb00 changed the title [SYSTEMDS-3951][WIP] MM Chain Optimization w/ Basic Average Estimator [SYSTEMDS-3951] MM Chain Optimization w/ Basic Average Estimator Jul 2, 2026
@ywcb00

ywcb00 commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

Thank you very much @Baunsgaard. This was exactly what I was searching for and I changed the test validation accordingly. :)

Does anybody know why the builtin.part2 java test resulted in a timeout and is this behavior related to the current PR? Test output states: Our tests should run faster than 1000 sec each

If this is not related to the PR, I would like to merge the changes soon.

All the best,
David

@Baunsgaard

Copy link
Copy Markdown
Contributor

Thank you very much @Baunsgaard. This was exactly what I was searching for and I changed the test validation accordingly. :)

Does anybody know why the builtin.part2 java test resulted in a timeout and is this behavior related to the current PR? Test output states: Our tests should run faster than 1000 sec each

If this is not related to the PR, I would like to merge the changes soon.

All the best, David

  1. You are welcome.

  2. The part2 seems like it got stuck with a threading error again, my hypothesis is Spark was closed by one thread, and another is waiting for response. **It is not related to your changes **. However, do you want to take a look at that suite to check if there is anywhere we close a spark session in an unsafe manner?

@Baunsgaard Baunsgaard left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few Nits, otherwise LGTM

return Arrays.asList(new Object[][] {
// {rows, cols, sparsities, eps},
{1000, 300, new double[]{0.10d, 0.10d}, Math.pow(10, -10)},
// {2, 300, new double[]{0.005, 1}, Math.pow(10, -10)},

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you changed it to parameterized, enable the other test as well

Comment on lines +140 to +141
Assert.assertTrue(log_out.stream().anyMatch(
l -> l.getMessage().toString().contains("mmchainoptsparse")));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to check more of the logging string.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

2 participants