# How can you trim an ipynb file that is too big for your Jupyter to open?

I was working on a multivariate regression analysis. There were over 80 explanatory variables, so I used stepwise selection based on the AIC (Akaike information criterion) to reduce them. BTW, as far as I know there is no ready-made AIC step function in Python, so you have to write one yourself.

An author on qiita.com wrote his own step function. I copied and pasted it into my ipynb.

```python
import numpy as np


def step_aic(model, exog, endog, **kwargs):
    """
    Select the best exogenous variables with the AIC.
    Both exog and endog values can be either str or list.
    (Endog list is for the Binomial family.)

    Note: This adopts only "forward" selection.

    Args:
        model: model from statsmodels.formula.api
        exog (str or list): exogenous variables
        endog (str or list): endogenous variables
        kwargs: extra keyword arguments for model (e.g., data, family)

    Returns:
        model: a model that seems to have the smallest AIC
    """

    # convert exog, endog to list format
    exog = np.r_[[exog]].flatten()
    endog = np.r_[[endog]].flatten()
    remaining = set(exog)
    selected = []  # contains adopted candidates

    # calculate the AIC for the constant-only model
    formula_head = ' + '.join(endog) + ' ~ '
    formula = formula_head + '1'
    aic = model(formula=formula, **kwargs).fit().aic
    print('AIC: {}, formula: {}'.format(round(aic, 3), formula))

    current_score, best_new_score = np.ones(2) * aic

    # adopt all elements, or end the loop once adding any element no longer improves the AIC
    while remaining and current_score == best_new_score:
        scores_with_candidates = []
        for candidate in remaining:

            # calculate the AIC when adding the remaining elements one by one
            formula_tail = ' + '.join(selected + [candidate])
            formula = formula_head + formula_tail
            aic = model(formula=formula, **kwargs).fit().aic
            print('AIC: {}, formula: {}'.format(round(aic, 3), formula))

            scores_with_candidates.append((aic, candidate))

        # adopt the element that improved the AIC most as the best candidate
        scores_with_candidates.sort()
        scores_with_candidates.reverse()
        best_new_score, best_candidate = scores_with_candidates.pop()

        # if adding the candidate reduces the AIC, move it to the selected list
        if best_new_score < current_score:
            remaining.remove(best_candidate)
            selected.append(best_candidate)
            current_score = best_new_score

    formula = formula_head + ' + '.join(selected)
    print('The best formula: {}'.format(formula))
    return model(formula, **kwargs).fit()
```

Here is the problem. The line `print('AIC: {}, formula: {}'.format(round(aic, 3), formula))` runs once for every model that gets fitted, so it yields a huge amount of text output in my notebook, which made the file about 80 MB. Have you ever heard of an 80 MB ipynb? Jupyter Notebook could not handle it and froze. To solve this problem, you have to trim the ipynb. But how? Your local Jupyter cannot open it. I once tried deleting the unnecessary parts manually by opening the ipynb in my editor (as a JSON file), but that took a huge effort.
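In hindsight, the flood of printed lines could have been kept out of the notebook in the first place. Here is a minimal sketch of one way to do that with the standard library's `contextlib.redirect_stdout`. The helper `run_quietly` and the stand-in function `noisy_square` are hypothetical names I made up for illustration; they are not part of the original step function.

```python
import contextlib
import io


def run_quietly(func, *args, **kwargs):
    """Call func while capturing everything it prints to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    # Return the captured text too, in case you want to inspect part of it
    return result, buffer.getvalue()


def noisy_square(x):
    # Hypothetical stand-in for a chatty function such as step_aic
    for i in range(3):
        print('step', i)
    return x * x


value, log = run_quietly(noisy_square, 4)
print(value)
print(len(log.splitlines()), 'lines captured')
```

Calling `run_quietly(step_aic, ...)` with the same arguments as before should keep the per-model lines out of the cell output. The captured log only lives in memory, so it does not end up in the ipynb unless you display it.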

My idea was to use Google Colab, which can open and handle a file this big.

I opened the 83 MB ipynb file in Colab, and Colab could handle it. From the Colab GUI, you can choose the cell outputs you want to delete, then download the file back to your local directory and reopen it. In this way I eventually trimmed the original file down to 2 MB.
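If uploading to Colab is not an option, the manual JSON edit I mentioned earlier can also be scripted. The following is only a sketch, assuming the notebook uses the nbformat 4 layout (a top-level `cells` list in which each code cell has an `outputs` list). The function names and the `max_chars` threshold are my own choices for illustration, and I did not use this script on the original file.

```python
import json


def strip_large_outputs(notebook, max_chars=100_000):
    """Empty the outputs of code cells whose serialized outputs exceed max_chars.

    Assumes the nbformat 4 layout. Returns the number of cells that were trimmed.
    """
    removed = 0
    for cell in notebook.get('cells', []):
        if cell.get('cell_type') != 'code':
            continue
        size = len(json.dumps(cell.get('outputs', [])))
        if size > max_chars:
            cell['outputs'] = []
            removed += 1
    return removed


def trim_notebook(src_path, dst_path, max_chars=100_000):
    """Read src_path, trim oversized outputs, and write the result to dst_path."""
    with open(src_path, encoding='utf-8') as f:
        notebook = json.load(f)
    removed = strip_large_outputs(notebook, max_chars)
    with open(dst_path, 'w', encoding='utf-8') as f:
        json.dump(notebook, f, indent=1, ensure_ascii=False)
    return removed


# Tiny in-memory notebook to show the effect (made-up content)
demo = {
    'nbformat': 4,
    'nbformat_minor': 5,
    'metadata': {},
    'cells': [
        {'cell_type': 'markdown', 'metadata': {}, 'source': ['# notes']},
        {'cell_type': 'code', 'metadata': {}, 'execution_count': 1,
         'source': ['noisy()'],
         'outputs': [{'output_type': 'stream', 'name': 'stdout',
                      'text': ['a long line of output\n'] * 1000}]},
        {'cell_type': 'code', 'metadata': {}, 'execution_count': 2,
         'source': ['quiet()'],
         'outputs': [{'output_type': 'stream', 'name': 'stdout',
                      'text': ['ok\n']}]},
    ],
}

trimmed = strip_large_outputs(demo, max_chars=1000)
print(trimmed, 'cell(s) trimmed')
```

Writing to a separate `dst_path` keeps the original file untouched, so nothing is lost if the threshold turns out to be too aggressive.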